Chitin degradative systems

ABSTRACT

The present invention relates to chitin degradative systems, in particular to systems containing enzymes that bind to and depolymerize chitin. These systems have a number of applications. The present invention also describes enzymes with at least two catalytic domains in which the domains are separated by poly-amino acid linkers.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 60/483,135, filed Jun. 27, 2003, and U.S. Provisional Application No. 60/483,383, filed Jun. 27, 2003, the contents of which are incorporated herein, in their entirety, by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention is generally directed to degradative enzyme systems. In particular, the present invention is directed to chitin depolymerases and associated proteins and enzymes found in Microbulbifer degradans.

2. Background of the Invention

Chitin, a homopolymer of repeating units of β-1,4-linked N-acetyl-D-glucosamine (GlcNAc), is the second most abundant polymer in the biome. It is found in various forms throughout the marine environment and is a component of crustacean and insect exoskeletons, yeast and fungal cell walls, and diatoms. Chitin is usually at least 90% acetylated and is often in a complex with proteins and other carbohydrates. The microcrystalline structure of chitin varies between antiparallel sheets (alpha chitin), parallel sheets (beta chitin), and a mixture of both (gamma chitin). Alpha chitin is found in the calyces of hydrozoa, mollusks, plankton, and as a component of the cuticles of arthropods. Beta chitin, a less stable and more degradable form of chitin, is found in mollusks, squid pen, diatoms, and insect exoskeletons and cocoons, and is the major component of fungal cell walls.

Microbulbifer degradans strain 2-40 is a marine γ-proteobacterium that was isolated from decaying Spartina alterniflora, a salt marsh cord grass in the Chesapeake Bay watershed. Consistent with its isolation from decaying plant matter, M. degradans strain 2-40 is able to degrade many complex polysaccharides, including cellulose, pectin, xylan, and chitin, which are common components of the cell walls of higher plants. M. degradans strain 2-40 is also able to depolymerize algal cell wall components, such as agar, agarose, and laminarin, as well as protein, starch, pullulan, and alginic acid. In addition to degrading this plethora of polymers, M. degradans strain 2-40 can utilize each of the polysaccharides as the sole carbon source. Therefore, M. degradans strain 2-40 is not only an excellent model of microbial degradation of insoluble complex polysaccharides (ICPs) but can also be used as a paradigm for complete metabolism of these ICPs. ICPs are polymerized saccharides that are used for form and structure in animals and plants. They are insoluble in water and therefore are difficult to break down.

Chitin is a difficult substrate for microbial degradation because it is usually crystalline and complexed with protein, salts, and other carbohydrates. Chitin is resistant to chemical degradation and is difficult to digest enzymatically because of the multiple steps required to expose and cleave the polymer. Because chitin resists chemical and physical breakdown, microorganisms must play a major role in its degradation. Many microorganisms have developed efficient strategies for the depolymerization, transport, and metabolism of chitin and its derivatives. These systems involve multiple enzyme activities, usually encoded on separate polypeptides. For example, Pseudoalteromonas strain S91, Serratia marcescens, and Streptomyces coelicolor secrete several chitin-depolymerizing enzymes in the presence of chitin. Surprisingly, almost no free chitin is found in marine sediments, demonstrating the efficiency of these microbial systems. Therefore, chitin represents an abundant source of carbon and nitrogen to microorganisms in the marine environment.

The glycoside hydrolase family 18 (GH18) domain is the most common catalytic domain of microbial chitin depolymerases. Despite sharing a consensus sequence and a conserved catalytic glutamic acid residue, GH18 domains differ in their activity toward polymeric chitin and chito-oligosaccharides (i.e., endo- versus exo-activity). Chitodextrinases, which depolymerize chitooligosaccharides but not chitin, also contain GH18 domains. Chitinolytic enzymes with GH18 domains have been isolated from organisms as diverse as psychrophilic eubacteria and hyperthermophilic archaeons, demonstrating the wide range of conditions to which these domains have adapted. Because conserved residues are found in GH18 domains with divergent optima and substrate specificities, sequence analysis is insufficient to determine the enzymatic specificities of newly discovered chitinases.

Endo- and exo-chitinases that function cooperatively to depolymerize chitin are known. Endochitinases randomly cleave glycosidic linkages, generating free ends and long chitooligosaccharides. These are then acted upon by exochitinases that release chitobiose from the non-reducing ends of each. While exo- and endo-chitinases are not able to depolymerize chitin alone, the presence of both activities significantly increases the efficiency of chitinolytic systems.

Therefore, there exists a need to identify enzyme systems that use chitin as a substrate, express the genes encoding the proteins using suitable vectors, identify and isolate the amino acid products (enzymes and non-enzymatic products), and use these products as well as organisms containing these genes to degrade plant and animal waste.

SUMMARY OF THE INVENTION

One aspect of the present invention is directed to systems that degrade plant and animal waste.

A further aspect of the invention is directed to a method for the degradation of substances comprising insoluble complex polysaccharides. The method involves breaking at least one bond between glucosamine units in chitooligosaccharides by applying a composition comprising at least one polypeptide that binds to the chitooligosaccharides.

Another aspect of the present invention is directed to groups of enzymes that catalyze reactions involving chitin or chitooligosaccharides.

Another aspect of the present invention is directed to polynucleotides that encode polypeptides with chitin depolymerase activity, chitodextrinase activity, N-acetyl-D-glucosaminidase activity, or chitin binding activity.

A further aspect of the invention is directed to chimeric genes and vectors comprising genes that encode polypeptides with chitin depolymerase activity, chitodextrinase activity, N-acetyl-D-glucosaminidase activity, or chitin binding activity.

A further aspect of the invention is directed to polypeptides comprising at least two domains, in which the domains are separated by a poly-amino acid linker.

Another aspect of the invention is directed to the treatment of asthma by the application of a composition comprising at least one of SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19.

A further aspect of the invention is directed to a method for the identification of a nucleotide sequence encoding a polypeptide comprising any one of the following activities from M. degradans: chitin depolymerase, chitodextrinase, N-acetyl-D-glucosaminidase, or chitin binding. An M. degradans genomic library is constructed in E. coli and screened for the desired activity. Transformed E. coli cells with specific activity are created and isolated.

Other aspects, features, and advantages of the invention will become apparent from the following detailed description when taken in conjunction with the accompanying figures, which are part of this disclosure and which illustrate by way of example the principles of this invention.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 shows poly-amino acid linkers, glutamic acid-rich domains, and conserved modules found in several polypeptides from M. degradans;

FIG. 2 shows a model for the chitin depolymerization of ChiB; and

FIG. 3 shows a polypeptide construct comprising a secretion signal, a lipoprotein box, a poly-amino acid linker, a multiple cloning site, and a protein of interest.

DETAILED DESCRIPTION

The degradation and metabolism of chitin by marine microorganisms appear to involve the synergistic action of multiple proteins, including several extracellular chitin depolymerases, non-catalytic chitin-binding proteins, chitodextrinases, and periplasmic and cytoplasmic N-acetylglucosaminidases (chitobiases and N-acetylhexosaminidases). These proteins typically include conserved modules that function as catalytic domains or chitin-binding domains and also contain domains of unknown function such as fibronectin type III and/or polycystic kidney disease (PKD) domains. Many of the genes for these enzymes have been cloned individually from chitin-degrading organisms.

M. degradans is unique among marine bacteria in its ability to degrade more than 10 ICPs. The draft genome sequence reveals over 130 putative carbohydrases involved in the degradation of these ICPs. Forty-six of these proteins contain poly-amino acid linkers, which are generally limited to secreted enzymes involved in ICP degradation. The majority of the amino acids in these linkers are serines. This finding strongly suggests the importance of poly-amino acid linker motifs in carbohydrate catalysis in nature.

M. degradans strain 2-40 efficiently metabolizes chitin, among many other ICPs. M. degradans strain 2-40 degrades and metabolizes chitin by expression of extracellular, periplasmic, and cytoplasmic systems for the depolymerization, transport, and metabolism of chitin-derived products. The chitinolytic system of M. degradans strain 2-40 includes at least three chitin depolymerases (ChiA, ChiB, and ChiC), a non-catalytic chitin-binding protein (CbpA), a chitodextrinase (CdxA), and three N-acetylglucosaminidases (HexA, HexB, and HexC). The proteins of this system contain domains similar to catalytic and binding regions of other microbial chitinases and in some cases polyserine- and hydroxyl-amino acid-rich linkers of unknown function.

Chitin depolymerases are enzymes that cause the cleavage of the β-1,4-linkage between N-acetyl-D-glucosamine units. Chitin binding proteins are proteins that can bind to chitin. Chitodextrinases are proteins that are able to degrade soluble chitooligosaccharides but not polymeric chitin. N-acetylglucosaminidases are proteins that are able to cleave chitobiose to form two GlcNAcs.

FIG. 1 shows various domains in ChiA, ChiB, ChiC, CbpA, CdxA, HexA, HexB, and HexC, which were isolated from M. degradans and expressed in E. coli according to the procedures described in Howard et al. (J. Bacteriol. 185(11), 3352-3360, 2003), the contents of which are herein incorporated in their entirety by reference. Protein sequence analysis of these proteins revealed the presence of putative type II secretion signals (black boxes); GH18, GH20, and GH3 domains; poly-amino acid domains (dotted boxes); chitin-binding domains (cross-hatched boxes); glutamic acid-rich domains (grey boxes); carbohydrate-binding domains; PKD domains (boxes with horizontal lines); soluble sugar-binding domains (hatched boxes); and conserved modules found in other microbial chitinases. FIG. 1 also shows the two domains of ChiB (black bars). GH18_(N) is located between amino acids 221 and 605 of ChiB. GH18_(C) is located between amino acids 860 and 1254.

The chiA (SEQ ID NO: 1) gene has 1632 base-pairs. The chiB (SEQ ID NO: 2) gene has 3186 base-pairs. The chiC (SEQ ID NO: 3) gene has 2379 base-pairs. The cbpA (SEQ ID NO: 4) gene has 1347 base-pairs. The cdxA (SEQ ID NO: 5) gene has 3441 base-pairs. The hexA (SEQ ID NO: 6) gene has 2388 base-pairs. The hexB (SEQ ID NO: 7) gene has 2670 base-pairs. The hexC (SEQ ID NO: 8) gene has 1035 base-pairs.

ChiA (SEQ ID NO: 9) is a 543-amino-acid protein with a calculated mass of 57.0 kDa. ChiA comprises two Cbd3 motifs and a GH18 domain. The first Cbd3 consists of 46 residues and is most similar to a Cbd3 of ChiA from Pseudoalteromonas sp. strain S91. The sequence of the second, 47-amino-acid Cbd3 is similar to the Cbd3 sequence from ChiA of Vibrio cholerae. The 299-amino-acid GH18 domain exhibits the highest identity with the GH18 domain of ChiA from V. cholerae. The two chitin-binding domains of ChiA are amino-terminal and are separated by a poly-amino acid linker. The second binding domain is followed by an additional poly-amino acid linker and the GH18 catalytic domain.
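As a rough arithmetic cross-check on the lengths recited above, the size of a coding sequence can be estimated from the size of its product. The sketch below assumes that a recited gene length counts three base-pairs per residue plus one stop codon; the function name orf_length_bp is illustrative only and is not part of the disclosure.

```python
def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Return the number of base-pairs needed to encode n_residues amino acids.

    Assumption: one codon (3 bp) per residue, plus an optional stop codon.
    """
    if n_residues < 0:
        raise ValueError("n_residues must be non-negative")
    codons = n_residues + (1 if include_stop else 0)
    return 3 * codons


# ChiA is recited as a 543-residue protein encoded by a 1632 base-pair gene.
print(orf_length_bp(543))  # 1632
```

Under this assumption, the recited ChiA and chiA lengths agree with each other.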

ChiB (SEQ ID NO: 10) is a modular, 1,271-amino-acid enzyme with a calculated molecular mass of 136.1 kDa. The amino terminus is predicted to contain a secretion signal that is separated from the remainder of the protein by a poly-amino acid linker of 148 amino acids, 99 of which are serine residues. ChiB includes two complete GH18 domains—an amino-terminal domain GH18_(N) and a carboxy-terminal domain GH18_(C)—separated by a 180-amino-acid linker domain which includes an acidic region consisting of TE-(ET)₁₀ and another poly-amino acid linker containing 39 serine residues. Both GH18 domains of ChiB are catalytically active but differentially cleave glycosidic linkages, depending on their location within the chitin polymer. In addition, chitin depolymerization is enhanced by the presence of both domains.
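The repeat notation and the serine content recited for ChiB can be made concrete with a short calculation. The sketch below assumes that TE-(ET)₁₀ denotes the dipeptide TE followed by ten copies of ET; the helper names are hypothetical.

```python
def expand_repeat(prefix: str, unit: str, copies: int) -> str:
    """Expand a 'prefix-(unit)n' style notation into a plain residue string."""
    if copies < 0:
        raise ValueError("copies must be non-negative")
    return prefix + unit * copies


def percent_of(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Assumed reading of TE-(ET)10: 'TE' followed by ten 'ET' units.
acidic_region = expand_repeat("TE", "ET", 10)
print(len(acidic_region))

# Serine content of the 148-residue linker, 99 residues of which are serine.
print(round(percent_of(99, 148), 1))
```

On this reading the acidic region spans 22 residues, and roughly two thirds of the amino-terminal linker is serine.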

One of the catalytic domains of ChiB functions as an endochitinase while the other functions as an exochitinase, as shown in FIG. 2. ChiB is the first eubacterial chitinase demonstrated to contain two functional GH18 catalytic domains. The lack of carbohydrate binding domains and typical accessory domains (e.g., Fibronectin Type III domains, PKD domains) coupled with the discrete activities of each catalytic domain emphasize the novelty of this enzyme.

When expressed as separate polypeptides, each GH18 domain of ChiB was able to depolymerize chitin in zymograms and was most active under similar temperature, pH, and ionic conditions. A detailed description of the procedures used for chitin degradation analysis can be found in Howard et al. (J. Bacteriol. 186(5), 1297-1303, 2004), the contents of which are herein incorporated in their entirety by reference.

GH18_(N) (SEQ ID NO: 17) is a 485 amino-acid protein (between amino acids 221 and 605 of ChiB), encoded by SEQ ID NO: 18 (1455 base-pairs; base-pairs 468 to 1924 of chiB). GH18_(N) is more active on MUF-diNAG than MUF-triNAG and displayed a pattern of activity typical of an exo-chitinase on chitooligosaccharides. Chitobiose was released from the non-reducing end of GlcNAc₄-GlcNAc₆.

GH18_(C) (SEQ ID NO: 19) is a 429 amino-acid protein (between amino acids 860 and 1254 of ChiB), encoded by SEQ ID NO: 20 (1287 base-pairs; base-pairs 2512-3800 of chiB). GH18_(C) releases MUF most rapidly from MUF-triNAG and is able to cleave chitooligosaccharides at multiple linkages, demonstrating endo-chitinase activity. GH18_(C) is more than twice as active on native chitin as GH18_(N). This is likely because native chitin has a paucity of free, exposed ends. Therefore, exochitinases have far fewer sites at which they can act than randomly cutting endochitinases, which can cleave virtually any glycosidic linkage in the polymer.

The synergistic degradation of chitin observed when both domains were present further supports their proposed function. The presence of both domains on separate polypeptides increased the release of reducing sugars 140% over the theoretical combined rate calculated if the domains were only to act additively. This synergism would not be observed if both domains had the same activity.
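The comparison against a theoretical additive rate can be expressed as a simple calculation. The following is a minimal sketch, assuming that "140% over" means the observed combined rate exceeds the sum of the individual rates by 140% of that sum; the rate values are hypothetical and are not measurements from this disclosure.

```python
def synergy_percent(rate_a: float, rate_b: float, rate_combined: float) -> float:
    """Percent by which the combined rate exceeds the additive expectation.

    The additive expectation is rate_a + rate_b, i.e. the rate predicted
    if the two domains acted independently with no cooperation.
    """
    additive = rate_a + rate_b
    if additive <= 0:
        raise ValueError("the additive rate must be positive")
    return 100.0 * (rate_combined - additive) / additive


# Hypothetical reducing-sugar release rates (arbitrary units).
exo_rate, endo_rate, both_rate = 10, 15, 60
print(synergy_percent(exo_rate, endo_rate, both_rate))  # 140.0
```

A value of zero would indicate purely additive behavior, as would be expected if both domains had the same activity.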

Carbohydrases with two catalytic domains are rare among prokaryotes. Only a small number have been characterized, mostly from ruminants and thermophiles. For example, Ruminococcus flavefaciens 17 and Fibrobacter succinogenes S85 produce xylanases with two catalytic domains, though the latter appears to encode a xylanase with two domains of the same function. Two extreme thermophiles, Anaerocellum thermophilum (a γ-subgroup proteobacterium) and Thermococcus kodakaraensis KOD1 (an archaeon), produce enzymes with two catalytic domains. A. thermophilum produces a cellulase with separate GH9 and GH48 domains that encode for endo- and exo-glucanase activity, respectively. A chitinase from T. kodakaraensis, Tk-ChiA, was shown to have an amino-terminal exochitinase domain, while the carboxy-terminus contains an endochitinase domain. Unlike ChiB of M. degradans, this enzyme also contains chitin-binding domains and is not predicted to anchor to the cell surface. Further, the exolytic domain of Tk-ChiA is able to weakly cleave the third glycosidic linkage from the non-reducing end of free chitin chains, an activity not observed in experiments with GH18_(N).

The dual catalytic domains of ChiB function cooperatively to degrade chitin to chitobiose. Though maximal depolymerization was achieved when the catalytic domains of ChiB were on separate polypeptides, there are benefits to their presence as a single unit.

First, a single promoter region is able to regulate the expression of two enzymatic activities. This permits two essential components of the chitinolytic system to be simultaneously regulated from a single locus, much like an operon regulating genes encoding a polycistronic mRNA. However, unlike an operon where several individual proteins are produced, a single enzyme is encoded. The amount of energy and secretion machinery needed to deliver two enzymatic functions to the exterior of the cell is therefore decreased.

Second, encoding both activities on a single polypeptide ensures the proximity of the two domains during the in situ depolymerization of chitin. This allows for a synergistic and focused degradation of the polymer. In the environment, secreted enzymes may diffuse away from their intended targets and not be available to assist other components of a degradative system. This is partially solved by the presence of carbohydrate binding domains (which appear to be lacking from ChiB), but there is no assurance that both endo- and exo-acting enzymes will bind to the same location and have the opportunity to act in concert to achieve the full potential of the system unless linked on a single polypeptide.

When both domains were present on the same polypeptide, the synergism between the domains was less apparent. The activity detected when the domains are joined was only a modest increase over the theoretical activity when compared to the activity of the two catalytic domains as separate entities. The decreased activity of the domains when linked may be the result of the domains then moving as a single protein as each encounters substrate. For example, as the exolytic domain is cleaving soluble chitooligosaccharides away from the insoluble polymer, the endolytic domain is unable to contact, and therefore degrade, its primary substrate. The amount of reducing sugars released would increase if the domains were free to act at different locations. Such an arrangement is of less benefit in nature where substrates are much more limited and less often encountered than in laboratory reactions.

A model of ChiB activity is shown in FIG. 2. ChiB likely attaches to a surface of a cell via a lipoprotein anchor (cross-hatched box). Activity of the endochitinolytic GH18_(C) (oval) releases chitooligosaccharides from polymeric chitin (hatched box). Free chitooligosaccharides (small circles) are then acted upon by the exochitinolytic GH18_(N) (oval) that processively releases chitobiose from the non-reducing end. Free chitobiose would then be taken up by the cell and metabolized. The poly-amino acid linkers (black “S”-shapes) may provide flexibility to the enzyme and optimize interaction with substrates.

Each catalytic site has been shown to be independently active, so the linkage between the domains prevents interference between them during the degradation of chitin. The processive cutting nature of exochitinases and random cutting behavior of endochitinases is applied to the activity model of ChiB. As GH18_(C) releases chitooligosaccharides from the polymer, they can be immediately acted upon by GH18_(N) which processively cleaves chitobiose from the non-reducing end. The lipoprotein acylation site present at the amino terminus of ChiB likely functions to anchor the enzyme to the outer membrane. This notion is strengthened by the observation that chitinase activity has been associated with outer membrane preparations of M. degradans. The membrane anchorage keeps two critical enzymatic activities in close proximity to the cell and forgoes the necessity of chitin-binding domains. The catalytic domain arrangement within ChiB allows chitooligosaccharides released by the activity of the distal GH18_(C) to be transferred to the exo-acting domain, which is in close proximity to the outer membrane where newly formed chitobiose can be taken up by the cell. ChiB is found in crude membrane preparations of M. degradans.
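The division of labor in this model can be illustrated with a toy calculation. The sketch below is a deliberate simplification, not the enzyme's actual kinetics: it assumes a linear chain of GlcNAc units, endo-cleavage at caller-supplied internal linkages, and exo-processing that removes chitobiose two units at a time until fewer than two units remain. All function names are hypothetical.

```python
from typing import Iterable, List, Tuple


def endo_cleave(chain_length: int, cut_sites: Iterable[int]) -> List[int]:
    """Split a chain of GlcNAc units at internal linkages.

    A cut site k means the bond between unit k and unit k + 1 (1-based).
    Returns the fragment lengths in order along the chain.
    """
    fragments, previous = [], 0
    for site in sorted(set(cut_sites)):
        if not 0 < site < chain_length:
            raise ValueError(f"cut site {site} is not an internal linkage")
        fragments.append(site - previous)
        previous = site
    fragments.append(chain_length - previous)
    return fragments


def exo_process(fragment_length: int) -> Tuple[int, int]:
    """Return (chitobiose units released, GlcNAc units left over)."""
    if fragment_length < 0:
        raise ValueError("fragment_length must be non-negative")
    return divmod(fragment_length, 2)


# A 20-unit chain cut by the endo-acting domain after units 6 and 11.
oligos = endo_cleave(20, [6, 11])
print(oligos)                            # [6, 5, 9]
print([exo_process(n) for n in oligos])  # [(3, 0), (2, 1), (4, 1)]
```

In this simplified picture the endo-acting step creates the free ends on which the exo-acting step depends, which is consistent with the cooperative behavior described above.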

ChiC (SEQ ID NO: 11) is a 792-amino-acid polypeptide with a calculated molecular mass of 87.1 kDa. ChiC has two Cbd3 domains. The first, a 46-amino-acid domain, is most similar to the Cbd3 of ChiB from Vibrio harveyi. The second, consisting of 49 amino acids, is most similar to the Cbd3 of ChiA from V. cholerae. ChiC also contains three PKD-like domains. ChiC has a 350-amino-acid C-terminal GH18 catalytic domain with strong similarity to ChiC from Streptomyces peucetius.

CbpA (SEQ ID NO: 12) is a 449-amino-acid polypeptide consisting of two carbohydrate binding domains but with no apparent catalytic domain. The first chitin-binding domain consists of 220 amino acids and is most similar to the chitin-binding module of P. aeruginosa CbpD. The second is a 95-amino-acid type 2 carbohydrate-binding module with similarity to the CBM2 of a rhamnogalacturonan lyase from Cellvibrio japonicus (formerly Pseudomonas cellulosa). Similar chitin-binding proteins have been reported in a number of marine microorganisms, though their role in chitin degradation is poorly understood. It has been hypothesized that chitin-binding proteins keep a bacterium in close proximity to the chitin polymer to facilitate efficient degradation, though there is no direct evidence that these proteins bind both the cell and chitin simultaneously. A Glu-Pro-rich domain consisting of (Glu-Pro)₇ is located between the carbohydrate-binding modules of CbpA.

CdxA (SEQ ID NO: 13) is a 1,088-amino-acid polypeptide, with a calculated molecular mass of 115.6 kDa, and comprises a typical type II-dependent secretion signal, two PKD domains, a 403-amino-acid GH18 catalytic site, and a 41-amino-acid Cbd3 chitin-binding domain. The GH18 domain is most similar to that of chitodextrinase ChiD from Alteromonas sp. strain O-7. The Cbd3 domain was most similar to the Cbd3 in Pseudoalteromonas sp. strain S91 ChiA.

HexA (SEQ ID NO: 14) is a 795-amino-acid polypeptide with a predicted molecular mass of 88.5 kDa. HexA carried a GH20b domain (glycosyl hydrolase family 20 catalytic domain 2) that is most similar to the active site of the Alteromonas sp. strain O-7 N-acetylhexosaminidase and a 348-aa GH20 domain related to the active site of the Pseudoalteromonas sp. strain S91 N-acetylglucosaminidase. HexA has an N-terminal type II-dependent secretion signal and may be a surface-anchored lipoprotein like ChiB.

HexB (SEQ ID NO: 15) is an 889-amino-acid polypeptide with a predicted mass of 98.4 kDa that contains a putative carbohydrate-binding domain, a GH20b domain found in the N-acetylhexosaminidase B of Alteromonas sp. strain O-7, and a 406-amino-acid GH20 domain identified as the active site of the N-acetylhexosaminidase of Vibrio vulnificus. HexB also contains an N-acetylhexosaminidase-like C-terminal domain related to the N-acetyl-D-glucosaminidase from Enterobacter sp. strain G-1. HexB also has an N-terminal type II-dependent secretion signal. The overall similarity of HexA and HexB to other N-acetylglucosaminidases and retention of key catalytic domains are consistent with their proposed activity.

HexC (SEQ ID NO: 16) is a 345-amino-acid polypeptide with a predicted mass of 37.4 kDa that lacks an apparent N-terminal secretion signal. HexC has a GH3N domain (glycosyl hydrolase family 3 N-terminal domain) similar to that of Pseudomonas aeruginosa N-acetylglucosaminidase. HexC likely degrades cytoplasmic chitobiose. This activity could have a role in the regulation of genes activated by the presence of chitobiose and would also release GlcNAc for use as an energy source.

It is one aspect of the present invention to provide a nucleotide sequence that has a homology selected from 100%, 99%, 98%, 97%, 96%, 95%, 90%, 85%, 80%, or 75% to any one of SEQ ID NO:1, SEQ ID NO:2, or SEQ ID NO:3, having chitin depolymerase activity; SEQ ID NO:4, having chitin-binding protein activity; SEQ ID NO:5, having chitodextrinase activity; and any one of SEQ ID NO:6, SEQ ID NO:7, or SEQ ID NO:8, having N-acetylglucosaminidase activity. The present invention also covers replacement of between 1 and 20 nucleotides of any of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, or SEQ ID NO:8 with non-natural or non-standard nucleotides, for example phosphorothioate, deoxyinosine, deoxyuridine, isocytosine, isoguanosine, and ribonucleic acids including 2'-O-methyl, and replacement of the phosphodiester backbone with, for example, alkyl chains, aryl groups, and peptide nucleic acid (PNA).
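A percentage of this kind can be illustrated by counting matching positions between two sequences. The sketch below is a simplification that assumes the two sequences are already aligned, ungapped, and of equal length; in practice an alignment program would be used first. The function name and the example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two pre-aligned sequences match.

    Assumption: both sequences have the same length and contain no gaps.
    Comparison is case-insensitive.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Two hypothetical 10-nucleotide sequences differing at one position.
print(percent_identity("ATGCATGCAT", "ATGCATGCAA"))  # 90.0
```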

It is another aspect of the present invention to provide a nucleotide sequence that hybridizes to any one of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, or SEQ ID NO:8 under a stringency condition selected from 1×SSC, 2×SSC, 3×SSC, 4×SSC, 5×SSC, 6×SSC, 7×SSC, 8×SSC, 9×SSC, or 10×SSC.
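For orientation, the SSC multiples recited above can be converted to salt concentrations. The sketch below assumes the conventional definition of 1×SSC as 150 mM sodium chloride and 15 mM sodium citrate, which is not stated in this disclosure; the function name is hypothetical. Lower multiples correspond to lower salt and therefore to more stringent hybridization.

```python
from typing import Tuple

# Conventional 1x SSC composition (an assumption, not recited in the source).
NACL_MM_PER_X = 150.0
CITRATE_MM_PER_X = 15.0


def ssc_concentrations(fold: float) -> Tuple[float, float]:
    """Return (NaCl mM, sodium citrate mM) for a given multiple of SSC."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return fold * NACL_MM_PER_X, fold * CITRATE_MM_PER_X


for fold in (1, 2, 10):
    nacl, citrate = ssc_concentrations(fold)
    print(f"{fold}x SSC: {nacl:.0f} mM NaCl, {citrate:.0f} mM sodium citrate")
```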

It is another aspect of the present invention to provide a nucleotide sequence that encodes a polypeptide having chitin depolymerase activity. It is yet another aspect of the present invention to provide a nucleotide sequence that encodes a polypeptide having chitin-binding ability. It is a further aspect of the present invention to provide a nucleotide sequence that encodes a polypeptide having N-acetylglucosaminidase activity. It is well understood that due to the degeneracy of the genetic code, an amino acid can be coded for by more than one codon. Therefore, the present invention encompasses all polynucleotides that code for any one of SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 11; SEQ ID NO: 12; SEQ ID NO: 13; SEQ ID NO: 14; SEQ ID NO: 15; or SEQ ID NO: 16.
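The consequence of codon degeneracy can be quantified: the number of distinct polynucleotides encoding a given polypeptide is the product of the number of codons available for each residue. The sketch below assumes the standard genetic code and ignores the choice of stop codon; the function name is hypothetical.

```python
# Number of sense codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide: str) -> int:
    """Count the distinct coding sequences for a peptide (stop codon excluded)."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"unknown residue: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total


print(count_encodings("MSG"))  # 24 = 1 * 6 * 4
print(count_encodings("SS"))   # 36 = 6 * 6
```

Because the count grows multiplicatively with length, even a short serine-rich stretch admits a very large number of equivalent coding sequences.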

The scope of this invention covers natural and non-natural alleles of any one of SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 11; SEQ ID NO: 12; SEQ ID NO: 13; SEQ ID NO: 14; SEQ ID NO: 15; or SEQ ID NO: 16. In a preferred embodiment of the present invention, alleles of any one of SEQ ID NO: 9; SEQ ID NO: 10; SEQ ID NO: 11; SEQ ID NO: 12; SEQ ID NO: 13; SEQ ID NO: 14; SEQ ID NO: 15; or SEQ ID NO: 16 comprise replacement of one, two, three, four, or five naturally occurring amino acids with similarly charged, shaped, sized, or situated amino acids (conservative substitutions). The present invention also covers non-natural or non-standard amino acids for example selenocysteine, pyrrolysine, 4-hydroxyproline, 5-hydroxylysine, phosphoserine, phosphotyrosine, and the D-isomers of the 20 standard amino acids.

Chitin degrading enzyme systems, including one or more enzymes or chitin-binding proteins, have a number of uses. In one embodiment, these systems can be used to degrade chitin to produce short chain chitooligosaccharides for use in medicine. These are therapeutic for those suffering from asthma, as indicated in Zhu et al. (Science 304, 1678-1679, 2004), the contents of which are incorporated herein, in their entirety, by reference.

Longer chain oligosaccharides have been shown to kill cancer cells and reduce blood pressure. In another embodiment, these systems are used to break down chitin into short chain sugars or to begin breaking down crab shell waste into a form that other bacteria can use. This is a valuable feedstock for bioreactors or fertilizers. In another embodiment, these systems are used to de-glycosylate proteins in plants and animals that are involved in disease. Removal of such glycosylation could be a therapy for crops or animals. The protein of interest can further be glycosylated with the appropriate sugars.

Chitin and chitosan can be used to absorb environmental pollutants and waste spills. The chitin could then be degraded by the chitin degrading systems of the present invention. Bacteria that can metabolize environmental pollutants and can degrade chitin could be used in bioreactors that degrade toxic materials. Such a bioreactor would be advantageous since there would be no need to add additional nutrients to maintain the bacteria—they would use chitin as a carbon source. Bacteria engineered to express the chitin degradative systems and metabolize environmental pollutants are one preferred embodiment of the present invention.

Chitin degrading enzyme systems can be supplied in dry form, in buffers, as pastes, paints, micelles, etc. Chitin degrading enzyme systems can also comprise additional components such as metal ions, chelators, detergents, organic ions, inorganic ions, and additional proteins such as biotin and albumin.

Other embodiments of the present invention involve strategic placement of correctly folded proteins on the surface of a bacterial cell and the separation of catalytic domains in an enzyme. The genome sequence of M. degradans revealed 46 proteins with large poly-amino acid domains. For example, ChiA and ChiB include long polyserine domains that appear to separate functional groups/catalytic domains.

Of the M. degradans genes identified that encoded proteins with poly-amino acid linkers, 18 contained a single poly-amino acid linker, while 28 had two or more. These poly-amino acid linkers have an average length of 39 residues and an average composition of 79% serine, 11% glycine, 7% threonine, and 3% alanine. Glycine residues are predominantly found immediately flanking tracts of polyserine sequence, and more than 80% of the poly-amino acid linkers have glycine residues at their start or terminus. Several of the poly-amino acid linkers also contain a single aspartic acid or cysteine residue. Though serine is a predominant residue within each poly-amino acid linker, none were identical in terms of exact residue composition or sequence. Each of the six codons for serine is used to encode serines within the poly-amino acid linkers. None of these codons is used preferentially, nor were any of them arranged in any obvious pattern or repeat.
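
As an illustration of how the residue composition and glycine flanking of a linker can be tallied, the following is a minimal sketch. The function names and the example linker are hypothetical and chosen only for illustration; the example is not a sequence from the M. degradans genome.

```python
from collections import Counter


def residue_composition(linker):
    """Return the percentage of each residue type in a linker sequence."""
    linker = linker.upper()
    counts = Counter(linker)
    total = len(linker)
    return {aa: 100.0 * n / total for aa, n in counts.items()}


def glycine_flanked(linker):
    """True if the linker has a glycine at its start or at its terminus."""
    linker = linker.upper()
    return linker.startswith("G") or linker.endswith("G")


# Hypothetical 39-residue linker (illustrative only): glycines at both ends,
# mostly serine, with a single threonine and a single alanine.
example_linker = "G" + "S" * 8 + "T" + "S" * 8 + "A" + "S" * 19 + "G"

composition = residue_composition(example_linker)
for aa, pct in sorted(composition.items(), key=lambda kv: -kv[1]):
    print(f"{aa}: {pct:.1f}%")
print("glycine flanked:", glycine_flanked(example_linker))
```

In this hypothetical example, 35 of the 39 residues are serine (about 90%), and the linker counts as glycine flanked. Averaging such per-linker percentages over a set of linkers would give figures comparable to the averages reported above.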

Poly-amino acid linker-containing proteins were identified using protein sequences based upon the translated nucleotide sequences of 140 completed microbial genomes and, where possible, the 125 unfinished microbial genomes found at the NCBI microbial genome homepage (http://www.ncbi.nlm.nih.gov/genomes/MICROBES/Complete.html). Non-redundant, annotated protein sequence databases were searched for poly-amino acid linker proteins using the PIR pattern/peptide match program at the Protein Information Resource server (http://pir.georgetown.edu/). The domain architecture of each poly-amino acid linker protein was analyzed using the Simple Modular Architecture Research Tool (http://smart.embl-heidelberg.de). Type II secretion signals were identified using the iPSORT program (http://www.hypothesiscreator.net/iPSORT) and lipoprotein acylation sites were identified at the DOLOP website (http://www.mrc-lmb.cam.ac.uk/genomes/dolop).
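
The exact pattern submitted to the PIR pattern/peptide match program is not stated here. As a rough sketch of what a pattern-based search for serine-rich linkers might look like, the following uses a regular expression; the minimum tract length of 10 residues, the 70% serine cutoff, the function name, and the toy protein are all assumptions made for illustration.

```python
import re

# Assumed pattern: an unbroken run of at least 10 residues drawn from
# serine, glycine, threonine, and alanine.
LINKER_RE = re.compile(r"[SGTA]{10,}")


def find_linkers(protein, min_serine_fraction=0.7):
    """Return (start, end, tract) for each serine-rich tract in a protein.

    Coordinates are 0-based and end-exclusive. Tracts matching the pattern
    but falling below the serine cutoff (for example a polyalanine run)
    are discarded.
    """
    hits = []
    for match in LINKER_RE.finditer(protein.upper()):
        tract = match.group()
        if tract.count("S") / len(tract) >= min_serine_fraction:
            hits.append((match.start(), match.end(), tract))
    return hits


# Toy protein (hypothetical): one glycine-flanked polyserine tract and one
# polyalanine tract that should be rejected by the serine cutoff.
toy_protein = "MKTWWWWW" + "G" + "S" * 20 + "G" + "WDEWDEWDE" + "A" * 12

for start, end, tract in find_linkers(toy_protein):
    print(start, end, tract)
```

Requiring a serine majority in addition to the character-class match keeps low-complexity regions of other composition from being reported as linkers.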

All of the 46 M. degradans poly-amino acid linker proteins are carbohydrate depolymerizing enzymes, carbohydrate binding proteins, or proteins with similarity to known proteins involved in carbohydrate degradation. These include 2 chitinases, 8 cellulases, 10 pectate lyases, 5 xylanases, 3 mannanases, a rhamnogalacturonan lyase, an alginate lyase, and 16 proteins of unknown function. Among the 16 proteins for which no activity could be predicted, each has weak similarity to a known degradative enzyme or contains sequence similarity to a known carbohydrate binding module (CBM) or catalytic domain. In cases where no sequence similarity was identified, the poly-amino acid linkers separated the proteins into segments large enough to contain presently unconfirmed catalytic sites or CBMs. Each of the 46 poly-amino acid linker-containing proteins contains a Type II secretion signal.

In M. degradans, poly-amino acid linkers separate predicted binding and/or catalytic domains. In nine proteins, a poly-amino acid linker immediately follows the secretion signal. All nine of these proteins contain an apparent lipoprotein acylation site, i.e. each has at least one positively charged residue within the first five amino acids, a hydrophobic stretch of 8 to 10 residues, and a lipobox containing the appropriately conserved amino acids, including a cysteine residue. In gram-negative bacteria, when the cysteine residue within a lipobox is acylated, the protein becomes anchored to the inner or outer membrane. In the present invention, poly-amino acid linkers separate anchoring domains from the remainder of a protein.
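
The three features of an apparent lipoprotein acylation site described above can be expressed as a simple screen. The following is a heuristic sketch only and is not the method used by the DOLOP database. The hydrophobic residue set, the lipobox pattern [LVI][ASTVI][GAS]C, the 40-residue search window, and the test sequences are assumptions; the stretch is matched as 8 or more residues rather than strictly 8 to 10.

```python
import re

POSITIVE = set("KR")                              # positively charged residues
HYDROPHOBIC_RUN = re.compile(r"[AVLIFMW]{8,}")    # assumed hydrophobic set
LIPOBOX = re.compile(r"[LVI][ASTVI][GAS]C")       # assumed lipobox consensus


def looks_like_lipoprotein_site(n_terminus, window=40):
    """Heuristic check for a lipoprotein acylation site at the N-terminus.

    Requires (1) a positively charged residue within the first five amino
    acids, (2) a hydrophobic stretch of at least 8 residues, and (3) a
    lipobox ending in cysteine that follows, or overlaps the last three
    residues of, the hydrophobic stretch.
    """
    seq = n_terminus.upper()[:window]
    has_positive = any(aa in POSITIVE for aa in seq[:5])
    run = HYDROPHOBIC_RUN.search(seq)
    if not has_positive or run is None:
        return False
    return LIPOBOX.search(seq, run.end() - 3) is not None


# Hypothetical N-termini, for illustration only.
print(looks_like_lipoprotein_site("MKRLLAVALLAALAGCGSSSSSSSS"))  # all three features
print(looks_like_lipoprotein_site("MKRLLAVALLAALAGSGSSSSSSSS"))  # no lipobox cysteine
```

Real signal peptides often contain small polar residues within the hydrophobic region, so a strict consecutive-run criterion such as this one would need to be relaxed before being applied to genome data.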

Forty-two of the 46 genes encoding poly-amino acid linker proteins are unique within the M. degradans genome sequence. The remaining four genes include two pairs of paralogs. The genes for two predicted pectate lyases (ZP_00067834 and ZP_00067832) exhibit greater than 75% identity among a carbohydrate binding domain and a Fibronectin Type III domain, and more than 80% identity between sequences corresponding to catalytic domains. The nucleotide sequence corresponding to the similarly located poly-amino acid linkers is less than 20% identical. Likewise, two cellulases (ZP_00066178 and ZP_00068260) also appear to have significant similarity at the nucleotide level except for their poly-amino acid linkers. In C. japonicus, the genes for XylB and XylC are located in tandem in the genome and contain duplicate sequence at their amino-termini, which includes a poly-amino acid linker. Duplicated genes wherein one of the genes encoded a poly-amino acid linker and the other did not were not identified in either organism. Thus, it does not appear that a known method of transposition or a recent, repetitive duplication event generated poly-amino acid linkers.

Interestingly, eight of the M. degradans poly-amino acid linker proteins are most similar to C. japonicus enzymes wherein sequence, overall domain architecture, and poly-amino acid linker location are conserved. Horizontal transfer is known to play a role in the acquisition of new genetic material by bacteria, though it often occurs in specific eco-niches, such as the rumen. It is unlikely that C. japonicus, a soil bacterium, and M. degradans, a marine bacterium, have recently shared a common environment. Thus, these genes may have been exchanged before each evolved to different habitats or may have been inherited from a common ancestor. In either case, these domain arrangements have been conserved for an evolutionarily long period of time, suggesting that the placement of the domains and poly-amino acid linkers within each enzyme is functionally significant.

Beyond the poly-amino acid linker proteins of M. degradans and C. japonicus, 17 poly-amino acid linker proteins were identified during searches of the non-redundant database as well as complete and incomplete microbial genome sequences. No proteins with poly-amino acid linkers were identified among archaea. Cellulose degrading enzymes with poly-amino acid linkers were identified in Pseudomonas sp. ND137, Xylella fastidiosa strain Temecula1, Xylella fastidiosa strain 9a5c, and Ruminococcus albus. Erwinia chrysanthemi encodes OutD, a pectic enzyme secretion protein that contains a poly-amino acid linker. These species, however, do not encode more than one protein with a poly-amino acid linker.

There are several observations that suggest poly-amino acid linkers are flexible. First, using the NORSp program, poly-amino acid linkers are not predicted to have a regular secondary structure, but are instead extended, ‘loopy’ regions. Secondly, lipovitellin, a eukaryotic protein that contains a poly-amino acid (polyserine) region, was partially crystallized. The poly-amino acid linker region was, however, not included in the crystal structure. This is consistent with the notion that disordered regions are not amenable to crystallization. Finally, glycine residues flank >80% of the poly-amino acid linkers in M. degradans proteins. These residues may increase the overall flexibility of these regions, as the flexibility of glycine is well documented. Taken together, these factors suggest that poly-amino acid linkers are disorganized, flexible spacers.

During the degradation of ICPs, flexible linker regions coupling a catalytic and a binding domain could expand the potential substrate target area available to the enzyme after a CBM makes contact with a polymer. Similarly, poly-amino acid linkers could enhance substrate availability to an enzyme anchored to a bacterial outer membrane, a potential survival advantage in the marine environment where diffusion and dilution are major factors affecting extracellular enzymes. In nine M. degradans enzymes and in several hypothetical proteins from other organisms, poly-amino acid linkers are located immediately after an amino-terminal lipobox, suggesting that poly-amino acid linkers can function to extend the catalytic and/or binding domains of a surface associated enzyme from the outer membrane.

Based upon thorough searches of existing prokaryotic genome databases, the known enzymes of C. japonicus, searches of the non-redundant database, and the considerable data afforded by analysis of the M. degradans genome, it is likely that in prokaryotes, poly-amino acid linkers are generally found within secreted, complex polysaccharide depolymerizing enzymes or proteins involved in carbohydrate binding or metabolism in order to assist in interaction with substrates.

While M. degradans encodes 46 proteins with poly-amino acid linkers involved in complex carbohydrate degradation, it likely contains nearly twice that number of extracellular carbohydrases wherein the domains are not separated by repetitive linking sequence. Similarly, C. japonicus also encodes carbohydrases that do not contain poly-amino acid linkers. The deletion of poly-amino acid linkers from two C. japonicus xylanases decreased their activity on insoluble substrates, but did not altogether abolish their activity or reduce binding. Furthermore, threonine/proline rich linkers have been shown to be dispensable with only moderate loss of activity. These observations indicate that while poly-amino acid linkers may not be required for carbohydrase function, they may have evolved to enhance the activity of certain enzyme configurations, particularly during in situ degradation of ICPs. Though poly-amino acid linker coding sequences are dynamic, their amino acid sequences are static, suggesting specific structural constraints associated with advantageous function.

Poly-amino acids also appear to function as linker regions between functional domains within enzymes and separate binding and catalytic domains. The average length of poly-amino acid linkers is 39 residues. They are composed mostly of serine (74%), but also contain alanine, threonine, and glycine. Another proposed function of these linkers is to provide additional space between functional modules, perhaps to allow for proper folding of the peptide and to allow a larger area to be accessed by the enzyme after it has bound a substrate.

In several of these proteins, the poly-amino acid linker separates catalytic or binding domains from an amino-terminal lipoprotein box. A lipoprotein box is likely used by γ-subgroup proteobacteria to anchor enzymes to the cell surface via an acylation of an internal cysteine. Being able to separate the lipoprotein box from the catalytic portions of a protein presents several advantages. First, functional domains are kept away from the cell surface, where close proximity may interfere with protein folding and function. Thus, in one embodiment of the invention, the poly-amino acid linkers provide a mechanism to tether correctly folded proteins to the cell surface. Second, in another embodiment, the poly-amino acid linkers ensure that the catalytic portions of the protein are exposed to the extracellular environment and not trapped in the periplasm or outer membrane. In a third embodiment, the poly-amino acid linkers expand the length of the protein so that it can ‘reach’ further into the environment to contact substrates.

The lipid-anchored proteins of M. degradans with poly-amino acid linker domains have most likely evolved to function well on the outer membrane of a Gram-negative bacterium. When functional in E. coli as lipoprotein anchors, these proteins are excellent tools for arraying any known protein of interest on the surface of an E. coli cell, while allowing the protein of interest to retain a native (and active) conformation.

The poly-amino acid linker domains described here have been observed in at least two other bacteria: Cellvibrio cellulosa and Teredinibacter spp. In these organisms, the poly-amino acid linker domains are not observed at the extreme amino-terminus, nor are they found in predicted lipoproteins. The amount of additional space that would exist between the cell and any arrayed enzyme by virtue of the poly-amino acid linker domain suggests that a protein expressed with this amino-terminal motif assumes a native conformation once on the cell surface.

In one embodiment of the present invention, this type of amino-terminal modification is incorporated into the construction of a plasmid vector that can be used to create fusion proteins with peptides of interest. This type of vector could have significant use in the fields of bioengineering and proteomics. In another embodiment of the present invention, this vector would allow proteins to be presented and anchored to the surface of the cell. This would allow waste to be modified by presenting an enzyme on the surface of the cell and growing the waste material in culture with the E. coli expression strain. By centrifuging the reaction, modified (and possibly valuable) products can be collected that would be substantially free of both cells and enzyme. This is of particular interest to the bioprocessing and bioremediation fields. This system could be used to display epitopes on the surface of any Gram-negative bacterium for vaccine development.

FIG. 3 shows another embodiment of the present invention. The hatched box represents the secretion signal and lipoprotein box in a polypeptide construct. In one embodiment of the present invention, a conserved cysteine (C*) is found within the lipoprotein box. This conserved cysteine is acylated by proteins in a host cell, for example Lol proteins in E. coli, thereby anchoring the construct to an outer membrane of the host cell. The poly-amino acid linker (dotted box) can begin preferably between 1 and 30 amino acids, more preferably between 3 and 25 amino acids, and most preferably between 5 and 15 amino acids after the conserved cysteine. In one aspect of the present invention, a multiple cloning site (MCS) is inserted after the poly-amino acid linker. The MCS can be inserted between 1 and 100 amino acids, more preferably between 25 and 75 amino acids, and most preferably between 30 and 50 amino acids after the poly-amino acid linker. A protein of interest can be ligated in frame with any one of the secretion signal, lipoprotein box, or poly-amino acid linker, which would allow the protein to be anchored to the outer membrane in its native conformation. This protein can then be cleaved off the membrane and isolated.
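
When laying out a candidate construct, the spacing ranges stated above can be checked programmatically. The following is a minimal sketch; the function and constant names are hypothetical, and the ranges are treated as inclusive, which is an assumption.

```python
# Spacing ranges, in amino acids, listed from narrowest to widest.
# Linker start is measured from the conserved cysteine; MCS position is
# measured from the end of the poly-amino acid linker.
LINKER_START_RANGES = [
    ("most preferred", 5, 15),
    ("more preferred", 3, 25),
    ("preferred", 1, 30),
]
MCS_RANGES = [
    ("most preferred", 30, 50),
    ("more preferred", 25, 75),
    ("stated range", 1, 100),
]


def classify_spacing(offset, ranges):
    """Return the label of the narrowest range containing the offset."""
    for label, low, high in ranges:
        if low <= offset <= high:
            return label
    return "outside stated ranges"


# Hypothetical construct: linker begins 10 residues after the cysteine,
# MCS sits 40 residues after the linker.
print(classify_spacing(10, LINKER_START_RANGES))
print(classify_spacing(40, MCS_RANGES))
```

Listing the ranges from narrowest to widest lets a single pass return the most specific preference tier that a given spacing satisfies.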

One aspect of the present invention comprises an isolated polypeptide, which further comprises at least two domains. Each of these domains is any one of a catalytic domain, a binding domain, a trans-membrane domain, a surface anchoring domain, or a lipoprotein acylation site. One domain of the isolated polypeptide is separated from another domain by a poly-amino acid linker, wherein at least 95% of the amino acids in the poly-amino acid linker are serines. In another aspect of the present invention, at least 90% of the amino acids in the poly-amino acid linker are serines. In another aspect of the present invention, at least 85% of the amino acids in the poly-amino acid linker are serines. In another aspect of the present invention, at least 80% of the amino acids in the poly-amino acid linker are serines. In another aspect of the present invention, at least 75% of the amino acids in the poly-amino acid linker are serines. In another aspect of the present invention, at least 70% of the amino acids in the poly-amino acid linker are serines.

Non-limiting examples of experimental methods used in the present invention are described.

Growth of bacterial strains. M. degradans strain 2-40 was grown in minimal medium containing (per liter): 2.3% Instant Ocean, 0.5% ammonium chloride, 0.2% glucose, and 50 mM Tris HCl, pH 7.6. Other carbon sources were added to a final concentration of 0.1%. Agar was added to a final concentration of 1.5% to prepare solid media. All cultures were incubated at 25° C. E. coli EC300, DH5αE, and Tuner strains were grown in Luria-Bertani (LB) broth or agar supplemented with the appropriate antibiotics and incubated at 37° C.
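
For convenience, the percentage components of the minimal medium can be converted to masses for a given batch volume. The sketch below assumes that the percentages are weight per volume (g per 100 mL), which is not stated explicitly; the function name is hypothetical. The Tris HCl component is omitted because its mass depends on the form of Tris used.

```python
# Percentage components of the minimal medium (assumed to be % w/v).
MINIMAL_MEDIUM_PERCENT = {
    "Instant Ocean": 2.3,
    "ammonium chloride": 0.5,
    "glucose": 0.2,
}


def grams_for_volume(percent_wv, volume_ml):
    """Mass in grams needed for a % w/v concentration in a given volume."""
    return percent_wv * volume_ml / 100.0


volume_ml = 1000.0  # one liter
for component, percent in MINIMAL_MEDIUM_PERCENT.items():
    print(f"{component}: {grams_for_volume(percent, volume_ml):.1f} g")
print(f"agar for solid medium: {grams_for_volume(1.5, volume_ml):.1f} g")
```

Under the weight per volume assumption, one liter requires 23 g of Instant Ocean, 5 g of ammonium chloride, and 2 g of glucose, with 15 g of agar for solid media.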

Construction of an M. degradans strain 2-40 genomic library. Strain 2-40 chromosomal DNA was isolated and prepared for ligation into pCC1. Sau3A fragments of 30 to 40 kb were isolated using gel extraction and ligated into BamHI-digested pCC1. The vector was packaged into phage and used to infect E. coli EC300. Transductants were selected using chloramphenicol (30 μg/mL).

Screening of the M. degradans strain 2-40 genomic library for chitin depolymerase activity. E. coli transductants were initially screened for chitin depolymerase activity by plating the library on LB agar supplemented with 0.1% chitin or 0.08% chitin azure and incubating for 5 days at 37° C. Chitin depolymerase activity was identified by zones of clearing around bacterial colonies. Alternatively, the chitin analogs 4-methylumbelliferyl-β-D-N,N′-diacetylchitobioside (MUF-diNAG) and 4-methylumbelliferyl-β-D-N,N′,N″-triacetylchitotrioside (MUF-triNAG) were used to screen transductants for chitinase activity. Single transductants were grown in 100 μL of LB broth supplemented with chloramphenicol (30 μg/mL). Cultures were incubated with gentle shaking at 25° C. for 12 h. A MUF analog was added to a final concentration of 1.5 μM and incubated with shaking at 25° C. for an additional 24 h. Cleavage of the analog was visualized using long-wavelength UV light.

Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and zymogram analysis. Concentrated culture supernatants of M. degradans strain 2-40 were prepared from 50-mL cultures grown at 25° C. for 50 h in minimal medium without glucose and supplemented with 0.1% chitin. All subsequent steps were performed at 4° C. Cultures were centrifuged at 10,000×g for 20 minutes, and the supernatants were sterilized by filtration through a 0.22-μm-pore-size filter. The filter-sterilized supernatant was then concentrated 100-fold using a centrifugal concentrator with a 10-kDa cutoff filter (Millipore). Proteins in concentrated culture supernatants were fractionated by SDS-PAGE with a stacking gel in an 8% acrylamide separating gel with a final concentration of 0.01% glycol chitin. Gels were then incubated in refolding buffer (50 mM Tris, 1 mM EDTA, 5 mM 2-mercaptoethanol [pH 7.5]) at 4° C. for 24 h. Gels were washed for 1 h in 100 mM sodium phosphate buffer (pH 7) at 25° C. and then incubated in 100 mM sodium phosphate buffer (pH 7) for 16 h at 37° C. Gels were rinsed and washed in developing buffer (0.5 M Tris, 0.01% Calcofluor [pH 7.5]) for 5 min and then rinsed with distilled water for 2 h with frequent changes of wash water. Zones of chitin depolymerase activity appeared as dark bands when viewed under UV light.

Protein expression and purification. Genes of interest were amplified using PCR and tailed primers. Each gene was digested with the appropriate restriction enzyme, ligated into the pETBlue-2 or pMal-2pX expression vector, and transformed into E. coli Tuner or E. coli DH5αE cells. A 50-ml culture of transformants carrying the clone of interest was grown at 37° C. to an optical density at 600 nm of 0.5 to 0.6, induced with isopropyl-β-D-thiogalactopyranoside (IPTG), and grown for an additional 3 hours at 37° C. Cells were harvested and resuspended in lysis buffer, and clarified lysates were prepared. pETBlue-2 His tag fusions were bound to Ni-NTA agarose, and pMal-2pX maltose-binding protein (MBP) fusions were purified using amylose resin. Fusion proteins were eluted with imidazole and maltose solutions, respectively. Fractions of interest were concentrated using centrifugal concentrators with 10-kDa cutoff filters, aliquoted, and stored at −80° C.

Chitinase activity assays using chitin analogs and chitooligosaccharides. To determine the specific activities of chitin depolymerases against the chitin analogs MUF-diNAG and MUF-triNAG, 900 μL of a 50 μM solution of each chitin analog was added to 100 μL of purified enzyme and incubated at 37° C. for 30 minutes. The fluorescence of the reaction (excitation wavelength, 365 nm; emitted wavelength, 460 nm) was determined using a Hoefer TKO 100 fluorimeter and compared to a standard curve prepared with 4′-methylumbelliferone. One unit of activity is defined as 1 μmol of 4′-methylumbelliferone released per mg of purified enzyme per min. Products of chitooligosaccharide-modifying enzyme reactions were identified by thin-layer chromatography. Enzyme reaction mixtures contained 0.45 μmol of chitooligosaccharide substrate in 10 mM Tris HCl at pH 7.5. After 1 h at 30° C., reactions were stopped by boiling for 10 minutes. Degradation products were fractionated on silica gel plates, which were developed in 2-propanol:ethanol:distilled water (5:2:1) for 1 hour. The plate was air dried and sprayed with 10% sulfuric acid in ethanol. The plate was dried and baked at 120° C. for 20 min. Chitooligosaccharide spots appeared brown and were compared to standards composed of chitooligosaccharides of known sizes.
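
The unit definition above reduces to a short calculation. The following sketch shows the arithmetic for the amount of analog in a reaction and for specific activity; the function names and the product and enzyme quantities in the example are hypothetical and are not measured values.

```python
def substrate_nmol(concentration_um, volume_ul):
    """Amount of substrate in nmol (µM x µL gives pmol; 1000 pmol = 1 nmol)."""
    return concentration_um * volume_ul / 1000.0


def specific_activity(product_umol, enzyme_mg, minutes):
    """Units per the definition above: µmol product per mg enzyme per min."""
    return product_umol / (enzyme_mg * minutes)


# 900 µL of a 50 µM analog solution, as described in the assay.
print(substrate_nmol(50.0, 900.0), "nmol analog per reaction")

# Hypothetical reading: 0.003 µmol of 4'-methylumbelliferone released by
# 0.01 mg of purified enzyme during the 30 minute incubation.
print(specific_activity(0.003, 0.01, 30.0), "units")
```

Each reaction therefore contains 45 nmol of analog, which sets an upper bound on the amount of 4′-methylumbelliferone that can be released.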

DNA and protein sequence manipulations and analyses. Protein modules and domains were identified using the Simple Modular Architecture Research Tool (SMART) and Pfam database (www.smart.embl-heidelberg.de). Similarity searches were performed using the BLAST algorithm at the National Center for Biotechnology Information (NCBI) server (www.ncbi.nlm.nih.gov). Type II secretion signals were identified using the iPSORT program (www.hypothesiscreator.net/iPSORT) and the SignalP version 1.1 program (www.cbs.dtu.dk/services/SignalP). Multiple-sequence alignments were performed using the ClustalW program (www.searchlauncher.bcm.tmc.edu). Estimated protein molecular masses were calculated using the Peptide Mass Tool at the ExPASy server of the Swiss Institute of Bioinformatics (www.us.expasy.org).

Complementation of a nagA mutant. The M. degradans strain 2-40 nagA gene was amplified using PCR and tailed primers with 2-40 genomic DNA as the template. The amplified DNA and pBluescript SK+(Amp^(r)) were digested with the appropriate restriction enzymes and ligated using T4 DNA ligase to create pNagA. E. coli K-12 strain IBPC531 (nagA::cm) was transformed with pNagA and plated on GlcNAc-containing minimal medium, which contains M63 minimal salts, 0.2% GlcNAc, ampicillin (50 μg/mL), and chloramphenicol (30 μg/mL).

Cloning and Expression of GH18_(N) and GH18_(C). Oligonucleotide primers were designed to amplify the nucleotide sequence corresponding to each catalytic domain by PCR using purified M. degradans genomic DNA as a template. Each amplified fragment was then digested with the appropriate restriction enzymes and ligated into the protein expression vector pETBlue2 using T4 DNA ligase. Expression constructs were verified by sequencing and transformed into E. coli Tuner™ DE3(pLacI) cells. Protein expression was performed according to standard protocols. Cells were lysed with BugBuster™ NT lysis buffer, centrifuged, and the supernatant collected. Supernatants containing recombinant enzymes were applied to a Ni-NTA agarose column and purified according to the manufacturer's protocol for native protein purification. Purified enzyme samples were quantified using a BSA protein quantification kit.

Glycol chitin zymography. Ethylene glycol chitin was incorporated into the separating portion of an SDS-PAGE gel to a final concentration of 0.01%. After fractionation of the proteins, the zymogram was incubated in refolding buffer (50 mM Tris-Cl, 1 mM EDTA, 5 mM 2-mercaptoethanol, pH 7.5) overnight at 4° C. and subsequently analyzed for chitin depolymerase activity.

Enzyme assays using chitin analogs. Solutions of 4′-methylumbelliferyl-N-N′-diacetylchitobiose [MUF-diNAG] and 4′-methylumbelliferyl-N-N′-N″-triacetylchitotriose [MUF-triNAG] were prepared in 50 mM sodium phosphate buffer (pH 7.0). Reaction mixtures contained 2 μg of purified enzyme and 30 μM analog solution. After incubation for 5 to 10 minutes at 37° C. for GH18_(N) or 5 to 20 minutes at 30° C. for GH18_(C), reactions were stopped by submersion in an ice water bath. Liberated methylumbelliferone was detected using a Hoefer TKO-100 fluorimeter. The reaction was measured at multiple time points between 5 and 20 minutes and was found to be linear, with less than 10% of the substrate being degraded.

Oligosaccharide electrophoresis. Reactions of chitooligosaccharides were incubated with 2 volumes of labeling solution (1.0 M sodium cyanoborohydride, 0.2 M 2-aminobenzoic acid) and dried under vacuum. Each sample was mixed with standard 2×SDS-PAGE loading buffer and fractionated in a 15% polyacrylamide gel at 45 mA constant current. Labeled oligosaccharides were visualized under UV light.

Determination of reaction optima for each domain. MUF-diNAG or MUF-triNAG was added to 20 μg of purified enzyme and incubated at a given pH or temperature and activity detected as described above. The buffers used were: sodium acetate (pH 4.0 to 5.5), MES (5.5 to 6.5), PIPES (6.5 to 7.0), HEPES (7.0 to 8.0), and Tris base (8.0 to 9.5). For a given enzyme, reaction conditions that permitted maximum activity were assigned a value of 100%. EDTA, EGTA, KCl, NiCl₂, SrCl₂, MgCl₂, MnCl₂, CuCl₂, CaCl₂, or HgCl₂ were added to reaction mixtures to a final concentration of 10 mM; NaCl was added at concentrations up to 1.0 M. Reactions containing metal ions contained 200 pmol enzyme and were incubated for ten minutes at 37° C. for GH18_(N) or twenty minutes at 30° C. for GH18_(C).
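
The normalization described above, in which the condition giving maximum activity is assigned 100%, can be sketched as follows. The function name and the activity readings are hypothetical and serve only to show the calculation.

```python
def relative_activity(activities):
    """Scale activities so that the highest value equals 100%.

    `activities` maps a condition label to a measured activity in any
    consistent unit. Raises ValueError if no activity was detected.
    """
    peak = max(activities.values())
    if peak <= 0:
        raise ValueError("no activity detected under any condition")
    return {condition: 100.0 * value / peak
            for condition, value in activities.items()}


# Hypothetical raw readings for one enzyme across three pH values.
raw = {"pH 6.0": 40.0, "pH 7.0": 80.0, "pH 8.0": 20.0}
for condition, percent in relative_activity(raw).items():
    print(f"{condition}: {percent:.0f}%")
```

Because each enzyme is scaled to its own maximum, relative activities allow the shapes of pH or temperature profiles to be compared between domains but do not compare their absolute rates.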

Enzyme assays using chitin and chitin derivatives. Purified enzyme and substrate (2 mg chitin or 10 nmol chitooligosaccharide) were added to 50 mM HEPES, pH 7.5 and incubated at 30° C. The amount of reducing sugar generated was determined by the DNSA assay. Specific enzyme activity was estimated by comparison to a standard curve.
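
Comparison to a standard curve is commonly done by fitting a straight line to the standards and inverting it for each sample. The sketch below assumes a linear response over the range used; the function names, standard amounts, and absorbance values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate the amount of reducing sugar."""
    return (signal - intercept) / slope


# Hypothetical standards: nmol of reducing sugar versus absorbance reading.
standard_nmol = [0.0, 50.0, 100.0, 200.0]
standard_abs = [0.02, 0.27, 0.52, 1.02]

slope, intercept = fit_line(standard_nmol, standard_abs)
sample_nmol = amount_from_signal(0.42, slope, intercept)
print(f"estimated reducing sugar: {sample_nmol:.1f} nmol")
```

Dividing the estimated amount by the enzyme mass and the incubation time would then give a specific activity. Samples whose readings fall outside the range of the standards should be diluted and measured again rather than extrapolated.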

Protein sequence analysis. Analysis of protein domains was performed using the Simple Modular Architecture Research Tool. Similarity between proteins and protein domains was determined by the BLAST algorithm. The lipoprotein anchoring site within ChiB was identified using the Database of Bacterial Lipoproteins.

The nucleotide and protein sequences of ChiA, ChiB, ChiC, CbpA, CdxA, HexA, HexB, and HexC have been placed in GenBank under the accession numbers shown below:

Gene    Accession No.
chiA    BK001043
chiB    BK001042
chiC    BK001044
cbpA    BK001045
cdxA    AY233270
hexA    BK001046
hexB    BK001047
hexC    BK001048

It is to be understood that while the invention has been described above using specific embodiments, the description and examples are intended to illustrate the structural and functional principles of the present invention and are not intended to limit the scope of the invention. On the contrary, the present invention is intended to encompass all modifications, alterations, and substitutions within the spirit and scope of the appended claims. chiA polynucleotide sequence 1 mfkktlavag lalaannafa atncsdltdw nsstaytggt svkhanskyt aqwwtqgadp SEQ ID NO: 1 61 tshsgqwqew kfidqcssss ssssssssss ssssssssss ssssssssts sssssssssg 121 gsctdapvfa entayntgdv vtnlenlysc wpgwcklgg ayepgqgwaw ehawnhvgtc 181 gtssssssss stssssssss ssssssssss ssssggvggg kvpahslvgy whnfvngagc 241 pmrisemsdk wdvidiafad ndpasngtvh fnlfpgtgnc pamnaeqfka dmralqaqgk 301 vfvlslggae gtitlntdad evnfvnsltn linewgfdgv didlesgsql lhgsqiqarl 361 itslrtidan vggmvltmap ehpyvqggyi aysgiwgayl piidalrdql dllhvqlynn 421 ggilspynpq tfpagsvdmm vasarmlieg fntgdggyfq glrpdqvslg lpsgpssags 481 glatnqaimd aldcitrgth cgtidaggiy psfngvmtws inwdahdgyi fsnpigdkvh 541 slp chiB polynucleotide sequence 1 atgaatttaa ctaaatttgc agtggctgca cttagtgttg ccgtactttc tgcatgtggc SEQ ID NO: 2 61 ggaggcgccg gtaacagccc tagccccggt gcaggttcca atacaaatac tgagtcggca 121 tctagcagct ccagttccag ctctagttct agcacaagtt caacatccag ttcttcttcc 181 agctctagtg gttcagcaga agtaaatgta gatattgacg ttgatatcga tgtggaaaac 241 ggctctagtt cgagcagctc atcaggctct agctcgtcta gcacgggcgg tggcgatatt 301 actattattg acgaaataga gagctcgacc agttcttcta cgtctagctc aagttccagt 361 ggcgcaacaa gttcaagcag tacttcttcg tctagcagtt cttcaagcag ctctagttca 421 tctggcgcta ccggctcgtc atctagcagc tctggtgcgg gtagtactag ttcatcatca 481 agctctagta gctcaagttc gtcttctagt tcatcgtcaa gttcttcaag ctcttctagt 541 tcatcaagca cgggcggtgg caatgcgggt gtagatgccg aattgggtta cagcattggc 601 gacgtctatg cgccaagctt tgattacacc gcagtaggcg gcgagcgcaa aacagataac 661 taccgcgtta ttggctatta catgccaagt ttagatggtt cgtttccgcc 
tagcgcaatt 721 ggtgagcaac aagcgcaaat gcttacccat attaactatg catttattgg tattaacagc 781 cagctagagt gcgattttat agatgtagaa aaagccgacg cagaaactca aattattgct 841 gagttacaag cactaaaaaa ttggaatgcc gatttaaaaa tccttttttc tgtagggggt 901 tgggcagaat ctaacgacgc agccgaaacc gttagccgct accgcgatgc gtttgcaccg 961 gcaaaccgcg agcattttgt tagctcgtgt gtagccttta tgcaacaaca cggctttgat 1021 ggcatagata tagattggga ataccctcgc gccgaagatg tagataactt tattgccggc 1081 ctagcagcaa tgcgcaacca attggatgca cgcggcaacg gcgagctagt taccattgct 1141 ggcgcaggcg gtgcgttctt tttaagccgt tattacagca agctagctgc catagtagaa 1201 cagttagact ttataaattt aatgacctac gacctaaacg gaccgtggaa cggcgtaaca 1261 aaaactaact ttcacgcaca cctgtacggc aacaaccaag agccgcgctt ttacaacgcg 1321 ctgcgcgaag cagaccttgg tttaacgtgg gaagaaatag tagagcgttt tcctagcccg 1381 ttcgagctca ccgtagatgc cgccattaaa caacatttaa tgatggatat tccgcgcgaa 1441 aaaattgtaa tgggcgtacc tttttacggt cgtgcatttt ttaacacagg ttcatcaaac 1501 accggtttat accaaacctt taacacccca aatggtgacc cctatgtagg tgacgctagc 1561 ttattggttg gttgtgaagc ctgcgaagcg cgcggcgagc cacgcattgc tacctttaac 1621 gatattcaac aacttataga aggtaactac ggctataccc gtcactttga tgatcaaacc 1681 aaagcgcctt ggttgtatca cgcagaaaat aatatatttg taacctacga cgatgctcaa 1741 tcgttggtgt ataaaaccga ttatattaaa caacaaggtt taggcggtgc gatgttttgg 1801 cacctaggcc aagatgattc gcaatttact ttattggcta ctttacacac cgagctaaac 1861 ggcgcaaacg ctggtagcct gcaaggtggc aatagcgaaa ccgacaacac aacggacgaa 1921 acagaaggca ataacgaaga caacaccgaa caaaacccag aagaaaatac cgatactgaa 1981 gaaacagaaa cagaaacaga aacagaaaca gaaacagaaa cagaaacaga aacagaaaca 2041 agcgtagagc aacccactgc gccaacaata gcttggatga acacaagcta taccggcagc 2101 agtgtaacgg tcactattac gtggaatatg tactggggta caaacggcaa ccaatggcag 2161 ctatggttag atggcgagca agtgtattca gccaacttaa ctaccaatgg ccaaaatgca 2221 caaaccgaca gcaaaatcat tactattact ggcgcaggtg ctcatagcgt tgaagttaaa 2281 ctgtgtaacc agcaagatat aaatgttagc tgtgctagcg atagcgaaac tatcactttg 2341 caaggcggta gtgatggcgc aacgtctagt tcttcttcca gcacgtcgtc aagctctagt 2401 
agctcgtctt ctagtactgg tggttcaacg tcgagcacaa gcagctcctc tagttctact 2461 agttcatcga gcagttcatc tagctctagt agttcaagta catcgggtgg cggcgaaaca 2521 gatttatctg gcgtggttta cggcgagtac aacaacactt acaaacagac gagcgataaa 2581 ataattgtta cttactttgt agagtggggc atttatggcc gcgactatca cgtaaataat 2641 attccggcgt ctaaccttac gcacgtactg tttggcttta ttgcaatgtg tggcgataac 2701 ccacacgcct caggcggcgc gcaagcggct attgctagcg agtgtgcaga taagcaagat 2761 tttgaagtta ccttggtaga tcgtttcgcc aacctagaaa aaacttaccc aggcgatacg 2821 tggtacgacg atacaaccgg tcaagattac aatggtaact ttgggcaact acgcaaacta 2881 aaagcacagc acccgcattt aaaaatattg ccatctattg gcggctggac aatgtctacc 2941 ccattttatg aaatggcaaa aaatgaagct aaccgcgcag tgtttgttga atctgccgtt 3001 aactttatta aaaaatatga cttcttcgac ggagtagata tagattggga ataccctgta 3061 tacggcggta cagccccaga attatctacc gctgccgacc gcgatgccta taccgcctta 3121 atgcgtgacc tacgcgcagc attagacgag ctggcagaag aaacgggtcg cgaatacgaa 3181 attacttcgg ccgtaggtgc agcaccagaa aaaattgcag cagtagatta cgccagtgcc 3241 acaacgtata tggattacat attcctaatg agctacgact acatgggcgc atgggcgaac 3301 acaacgggtc accacacccc gctgtacaac aacaacgaag agcgagaagg ttttaacaca 3361 catgcgtctg tgcaaaacct attaaccgca ggtgtgcctt catccaaatt agtcgtgggt 3421 ggtgcattct acggccgcgg ctgggtaggc acccaaaata ccaacgctgc caaaagcgat 3481 ttattcccgc tatatggcca agcttctggc gcggcaaaag gcacctggga agcaggggta 3541 caagactacc gcgacctgta cgacaactat attggcacca atggcacagg cattaatggc 3601 tttagcgcac actacgacga aatagccgaa gccgcctacc tttggaacag cagcaccggc 3661 gaatttataa gctacgattc gccgcgctct attgcagcaa aagccgatta cgtaaaacaa 3721 tacaatctag ctggcatgct aacctgggaa atagacggcg ataacggcca actactcaac 3781 gccattaacg aaagtttcgg caacgaaaag cagtag chiC polynucleotide sequence 1 atgaacccta tagctaaact cacattagcc actggcgcca tgctaagtgc gcatgtggcc SEQ ID NO: 3 61 tacgcttacg actgcgatgg ccttgccaca tggaacgcat cgtctgccta tgccggctct 121 accgttgtgc aacacagtaa cgtggcttac aaagccaact ggtggacaca aaaccaaaac 181 cccgcttcac attctggccc ttggcaagag tggacgaacc taggcaactg cgatggcgac 241 
ggtggcggca acaccaacca agcgcccagc gcaaatgcca acggccccta cgccgcgcaa 301 cttggcgccg ccatagcgtt tagctctgca ggctctagcg atagcgacgg caatattgcc 361 agctacaact ggacctttgg cgacggtaac agcagcaacc aagctagccc aagccacacc 421 tatggcagcc aaggcaccta cgcggttacc ttaaccgtta ccgataacga aggcgcaagc 481 agcagtgcca ccacaagcgc aagcgttacc caaggcggag accctggcga ttgccaagca 541 ccgcaataca gtgcgggcac ccaatacgct gcgggcgata tcgttgccaa tggcggcaac 601 ctgtaccagt gtaatattgc gggctggtgc tcttcatctg ccgcatgggc ctatgcccca 661 ggtactggcg cacactggca agatgcgtgg tcacttacga gcgaatgcga cgacaacggc 721 aacaccaacc aagcacctac agccaatgct aacggcccat attctggtag cgctggtata 781 agcattagtt ttagcagcaa tggctctgcc gacagcgacg gcacaattgc cagttacagt 841 tggaactttg gcgacggcgc aagcagcagc caagcaaacc caagccacag ctacatgaat 901 gaaggcactt accaagttag cctaaccgta accgatgacg acggcgcgag cgccaccgca 961 ttcaccaccg ctaacgtaac tggtaatggc gaaaaccaag agcctgttgc aagcattagt 1021 gcaccatcca gcgctagcga aggcgctagt gtgaactttt ccagcgcggg cagtaacgac 1081 ccagacggca gcatagttag ctacagctgg aactttggcg atggcactag cagtcaacaa 1141 gctaacccca gccacaccta cagcagcgca ggtagctata gcgttagcct aacggttgtt 1201 gataacgaag gcgcgaataa cgtcgccaac cacagcatta caatcagtgg cgataccggc 1261 ggcggtacac acggcgataa aattattggc tacttcgcag agtggggcgt atacggccgc 1321 aattatcacg ttaaaaacat tcacaccagc ggctctgccg acaaactcac tcacatcgtt 1381 tacgcgtttg gcaacgttca aaacggcgag tgtaaaattg gcgattccta cgcagcatac 1441 gacaaagcct acagcgcagc agacagtgta gatggcgttg ccgatacttg ggacgacggt 1501 gtactgcgcg gtaacttcgg tcaactacgc cgcttaaaag ccatgcaccc acaaattaaa 1561 atagtgtggt ctttcggtgg ctggacatgg tctggcggtt ttggcgaagc agcagcgaat 1621 gccgatcact ttgccaactc ctgttacgac ttagtattcg acgcacgctg ggcagacgtt 1681 ttcgacggca tcgacatcga ctgggaatac cccaacgact gcggcctaag ctgtgataat 1741 agcggctacg atggctaccg cgtactcatg caagcattgc gcaatcgttt tggcaacaaa 1801 ctagtaaccg ctgccattgg cgctggcgaa tctaaacaaa atgcagccga ctacggtggc 1861 gcagcacagt acttagattt ttacatgcta atgacctacg acttcttcgg cgcatttaac 1921 ccacaagggc caaccgcacc 
gcactcaccg ctatacaact acccaggcat gccaatagaa 1981 ggattctctt ctgaccacgg tatccaagta cttaaaagca aaggtgtacc tgccgagaaa 2041 atcttactgg gcataggctt ttacggccgc ggctggacca acgtaacgca agatgcccca 2101 ggcggcagcg ctaacggcgc agcacctggc acctacgaaa aaggcattga agattacaaa 2161 gtgttgaaaa acacctgccc agccaccggc acaattgccg gcaccgctta cgccaaatgc 2221 ggaagcaact ggtggggcta cgacacacca gccaccatcg atagcaaaat ggactacgcc 2281 aaacaacaag gcctaggcgg cgcgttcttc tgggagctaa gtggcgacac caccgatggc 2341 gaactgatta gagcgattga taatggctta aaaaactaa cbpA polynucleotide sequence 1 ttgcaaccga taaaatcaac taaaaggaac ctaatcatgt tcgcaaagaa aattacatac SEQ ID NO: 4 61 tccactatag ccttggccat cgcagggctt tctggtaacg cactatctca cggcttaatg 121 gtagacccgc cttcgcgtaa cgcgctgtgt gggatgatag aaaaacctga ccaagcaaca 181 tcacccgcct gccagcaagc tttccaaaat gactttaatg gcggctacca atttatgagc 241 gtgctaaccc acgacatagg tcgccaaggc ggcacgtcta ataatgtgtg tggctttgat 301 agcgaaacct ggaatggcgg tgcaaccccg tgggatgccg caattgattg gccaaccact 361 caaattagtt ctggcccgtt agaaatagat tggaatattt cttggggccc tcactgggac 421 gacaccgaag agtttgttta ctacattacc aagcctgact ttgtatacca ggtaggtgta 481 ccgctcagct ggagcgattt cgaggcaaca cctttttgcc aactcgacta cagcgatgca 541 aacccaaacg caaaccctgg cgtatccacc accaaaagtg ccaacctatt tcacactcaa 601 tgtaacgtac ctgcgcgctc tggccgccac gtgatttacg gtgaatgggg gcgcaactac 661 tttacctacg agcgattcca cggctgtatg gatgttacct ttggcggtag caacccaccc 721 cctagcaacc aagcgccaac agctaacgct caatctgtaa atgtaagtag cggtagcagt 781 gtctctatta ccttaagcgg cagcgatgta gatggtgtta ttagcagtta cgcaattgca 841 gcagcaccta gtaacggaag tttaagcggg tctggcgcgc agcgtttata cacacctaat 901 ggcaatttct cgggttcgga tagcttccaa ttcacagtaa ccgatgatga cggagcaaca 961 tccaatgccg cgaccgttag cattaatgta agctctcaac cagaaccaga acccgaaccc 1021 gagccagaac cagagcccga accaggaact ggcgctagct gtgagcacgt tgttgtaaat 1081 gcttgggata gtggcttcca aggcgctatt cgcataacta acactagcga ccaaaatatt 1141 aacggctgga atgtaagctg gagctacaac aatggcacta caattagcca gttgtggaat 1201 gcaaacttct cgggcagcaa cccttacagc 
gcaagcaacc taggttggaa cgcaaccatt 1261 caaccaggcc aaactgttga atttgggttt accggtaacg gctctgtacc cgcggcacca 1321 gcagtaacgg gtgcggtttg taattag cdxA polynucleotide sequence 1 atgaaaaata agcactgcct agccgctttg gcgctggcga tttctaccca tgcgtatgcc SEQ ID NO: 5 61 gcacctggca cgcccaatat tgcgtggctg cccgctaccc acgaaagtgg cgaagccata 121 aacgtacatt gggatatgtg gtggggtgaa aacggcaccg agtggcaatt aaccgataac 181 ggcgacctgc gctgcagcgg cagcctaaca gccaacggcc aaaaccaaca aagcgcggaa 241 tgcgccgcta actacagcag cggcagccat gcactgcagg ttagcttgtg taataccagc 301 ggctgtagcg aaagtaatgt tgttactatt aacgttaacc aaggcgcaag tagcaacgtg 361 ccacctcaag tatccattag cgcaccggca agtgcagggg agggggactc gataaccctt 421 agcgctacgg ccagcgacag cgacggcacg attacctctg taaccttttt agtcgatggt 481 attgccatag ctaccgatac caccagccca tacagcacaa actggatagc gaaagcgggt 541 actcactcac ttaccgcgca agcgctagat aaccaaaatg ccacaggcga tgattctgta 601 agtattagcg ttaccagcgc ccctaaccaa ttgcccagcg tgagcttggt tgctcccaat 661 gcaaacttaa tggcgggcag cgagaccagc tttgaaataa acgctagcga cgccggtggc 721 agtattagca gtgttgaatt gtacttaaac ggcaatttac tcggcaccga taccagcgcg 781 ccttacaacg ttagctggac agcagaagcg ggcgatcaca gcatttacgc cgtagcaagc 841 gacgatcgcg gcggtgtgag tcaatcggac acggtatttt taaccgtagc ggaagacaca 901 aatgcagcgc ctagcgtaag cctttcaacc gtaccaacag acgcaatgga aggtgatgca 961 ctcacacttg aggcagcagc aagcgacagc gatggcagtg ttgcgcaggt ggacttttac 1021 ctaaacaacc aactactagg cagcgccaca agcgcaccct acagtttgca atggacagcc 1081 acgcgcggca gccacacctt gcgcgcaacc gctgtggata accaaggtaa aacagccagc 1141 gcgattagca cctttagcgt tgctgcagac acaagcgcca gccacgaaga ctgccgacca 1201 gacgggcttt acgccacgcc agaagtgcaa tcgccttact gtactgttta cgacatacaa 1261 ggccgcgagc taatgggcag cgcaacgcgc cgcgtgattg gttacttcac tagctggcgt 1321 actggtggta acggcccggc ctaccttgca caccaaattc cctgggacaa gctaacccac 1381 attaactacg cctttgccca tgtggatggc aacaaccacg tttcaattgg cgccaatacc 1441 ccaaccaatg cagcaacggg tatggaatgg ccagacgtag ccggtgccga aatggaccca 1501 agctttagtt acaaaggcca cttcaacctg cttaacaaat acaaaaagca 
gtacccacac 1561 gttaaaacgc ttatctctat tggcggttgg gcagaaacag gcggctactt tgatagcaat 1621 ggcgaccgcg taaattctgg cggcttctac accatgacca ccaatgcaga cggttcggtt 1681 aacaccgccg gtatcaacac ctttgccgac tcggtagtgg agtttttacg cacctacagc 1741 tttgatggcg cagatataga ttacgaatac cccacatcga tgaacgatgc cggcaaccct 1801 tcagatttcg ccatcgccaa tgcgcgtcga aaaggcttaa acgcttcgta caacgtgttg 1861 atgaaaaccc tgcgccaaaa gctggatata gcaggggagc aagatggcaa gcactacatg 1921 cttaccatcg cctcgccatc gtcaggctat ttgttgcgcg gcatggaagc atttgaagca 1981 acccagtact tggactacgt caatatcatg tcctacgact tacacggtgc atggaaccag 2041 tttgtaggcc ccaatgcggc actgtttgat aacggccaag atgcagagct tattcagtgg 2101 aacgcttacg gcggccagta caaaaatatt ggctacctca acaccgactg ggcttaccac 2161 tacttccgcg gcgccatgcc ggcgggccgc attaacattg gtgtacctta ctacacccgc 2221 ggctggcagg gcgtaaccgg tggcaccaac ggtttatggg gccaagcatc cctgccaaat 2281 caaagcgaat gccctgtggg taccggcggc agcgccacca gtaaatgcgg caacggcgcg 2341 gtgggtatag ataacctatg gcacgacaag gatgaaaacg gcaacgaaat gggcgcgggt 2401 tctaatccca tgtggcacgc taaaaaccta gaaaacaata ttctagggga ttacctaaca 2461 gcctacggct tagacccaat caacaaccca gatcaccaac ttagcggtaa ctaccagcgt 2521 tattacgacg atgtattagt cgccccgtgg ttgtggaacg ccgctaagca ggtatttatc 2581 tctaccgaag acgagcaatc catcaaccgc aaagccgatt acgtagtaga aaacggcata 2641 ggcggcatta tgttttggga actagccggc gattaccaat tcaatgcggc caagggccaa 2701 tacgaaatgg gccacacgct aaccaccgcc attgcagata aatttgccaa cgcgccagcc 2761 tacggcaacc agcgtgcaga aattgatatg ccccagcaaa cgttagatat aggcataaag 2821 ctaactaact ttgccttggg tgataacaac ttccccatta cgccagacct aataattact 2881 aacaacacag gccaaaactt gcccggcggc accgagttct atttcgatat cgccacctct 2941 accccagata acatgggcga ccaaagcgca gcgagcttaa ccattgttag caacgggtct 3001 aacgcggcgg gtaacaatgt gggcggttta gaaaacaact tccaccgcgt aaaaataagc 3061 accccaagct acctcaccct tgccgacggc gaagaatgga aagtagtact taaatactac 3121 ctaccagttt ctatgccttc taactgggtg gttaacgtag ctggcgaaga gtttgcgctt 3181 agcagcgagt accctaactt gccgatgggc agcattagtt ctggtggcgg caataacggc 
3241 ggtggcaaca ccggtggcga ttgcagcaac gcaagcgact acccagctta ccctaacttt 3301 ccacaaaaag actgggccgg aaaccccagc cacgccaacg ccggtgaccg catgacccac 3361 aacaacgcgc tgtatgaagc caaatggtgg acaagtgcaa ccccaggtac atccgattgg 3421 gacttggtat gtacgtttta a hexA polynucleotide sequence 1 atgaaactaa gattattacc acatagtata agtttagcat cgctattact gctaagtgct SEQ ID NO: 6 61 tgccagcaag agcacgcaac cagtacaaac gcgcaactct cccctattgc accgcctgct 121 atctctattg ttcccgcacc ggtttcggca gaaataaaaa cagggcagtt tgtttttggt 181 aatagcacac agcttacagt taacagcgaa aagctaagag atgttgcgca gctttgggcg 241 gattttttta atgttgctag tggtattaat ttacaggttc aaagcgctac aggtaatagc 301 gatgaagcaa atagcgtaag tcttgagttg gtgccggctt cagaattctc atcaagcaat 361 gcagaagcct atgaattaac ggttacagat aatgcaataa cagtacgcgc tagcactcgc 421 gcgggtattt tttacggctt aaccagtttg cgccagttat tgccgccgca aatagaatca 481 ccctccccta ttaattctgt aaattgggtt gtacctgcgg ttgctattgt cgacgagccc 541 ttatacccct atcgcggtat gcacttagat gtaagccgcc actttttcga tgtgaatttt 601 attaaacgct atatagatat attagcgttc cacaaaatga atcgtttcca ttggcattta 661 accgatgacc aaggctggcg tattccgatc gacgcctacc ccctactcac agaaaaatcg 721 gcttggcgag acaaaacggt tataggccat acctacgacc gcgacgtagc ttacaacact 781 aatagaatag gcggttttta tagcaaagaa caaatacgag acatagttgc ttacgctgca 841 gaacgccaaa ttatggtaat tccagaaata gatgtccccg gccacgcagc agctatttta 901 cacgcttacc cagagtttgg ttgtatcgag caagtttcac aggtgcaaag caactttggc 961 attttcgagc aagtgctttg cccaaccgag ccaacctttg aatttttgcg cgcagtgttt 1021 accgaagttg ccgagttatt ccctggcgaa tacctacatg taggtggcga cgaagtaaaa 1081 aaagttcagt ggcaacagtc accctttgtt accgaattaa tgcagcgtga aggtttaaaa 1141 gactaccacg aagtacagag ctactttatt tgccgcgtag gcgagatagt aagtagctta 1201 gataaaaaaa tgttgggctg gaacgaaata ctcgacgggg gtattgctcc caatgcgact 1261 attatgtctt ggcaaggtgt tgaaggtggt attgctgccg ccgagctggg ccacgatgcg 1321 attatgtcgc cgggaaacta tgtgtacttc gatcactttc agtctcgctc ggtggatgaa 1381 ccacttgcca ttcacggtat tacaccgtta tcagaaacat actcttacaa ccccatgccc 1441 gaacaatttg ctggcacaga 
aaaagccaag cacatactcg gcgcccaagg gcaactgtgg 1501 acagagtacg tgcctaccac agcaaaagcg gagtatatga tactgccaag attaagtgcg 1561 gtagcagaaa taacctggac accagtcaac aagcaatcgt ggcaaagctt tagcgaaagg 1621 ctacccagcc tatttgcccg ctttgacgaa atgggcttaa acgcagcgcg atctgtttat 1681 gcaattaccg ctaccgcaaa aacggaaggc agcggtgaag atgccaaata ccgcgtaaac 1741 cttgcctccg atacggctca tgtaattatt cgctacacaa ccgacggcac cttgccgaat 1801 gcgcaatcgc ctatttatag cgaaccattt ttagtagaag gcgatacgtt tgtgagggcg 1861 cgtagccaag ataaaataag tggtaacttc tacctggaat cgcaactgcg caccgtaaaa 1921 cacaaagccg ttggcgccaa gctaacactg ttaagcgaag cgaatacaga gtggaataaa 1981 gacccagtaa aaaccttaag tgatggcatt acttcgatag accaaatatt tcaactcgac 2041 gactgggcca cattttttgg cgacgaggtt gttgcacata taaccttcgc taaggcacaa 2101 accgttagcg aagtaagcat tggctttaac cctggcaagc atcgccaaat gtacccaccc 2161 actcgtttgc atattttaag ctcaagcgat ggcgaaacat ggcaaagctt gggtgaagcc 2221 gacccacaac accttgccac cgcaaaaaat cgcgtaagtt acacctttgc accaacaacc 2281 actcgccacc tacggataga ggcggaaaat aaaacccgcg tactaagtac cgaaagcggt 2341 aagctaaaaa gcgttcccct atacttagat gaaataatcg ttaaataa hexB polynucleotide sequence 1 atggcgttat ttagcaagta tgtatggcaa gtggcagttg ccggagcatt aggtacggtt SEQ ID NO: 7 61 agtttgctgg gtagtcgttt atacgcgcaa actgcagata cacagcaatg gattgatggc 121 atagccagca atatgcaggt gcattatcaa gtactgctaa ataagggtga cggcgaatgc 181 agcttgccaa gcttaccgcc cagccccaaa tcaccatgct ctatagttga gctttcactc 241 agctcgccag ataagcttgc ggcaaacgac ttagatggta actggtctat ttacttcagc 301 caaaccgatc ccatttatgc gcacccagct ggtgaattta caatcgacca tataaatggc 361 gatttacacc gaattcgccc cagcgccagc taccaaggat ttaatgtggg cgaagttaaa 421 aaggtgcagt ttattgtggc gggtttaacc cttaccgaag ccaaaataat gcccaactat 481 tatgtggtag cagaagggca agataataaa caggcactat acagcgaagc ccgtgttatc 541 gaatcaacac gtattcgtat acacccagaa acagggttag aggagcgacc ttttgcaggc 601 gaaataagta ggcaaaattt taagctgtcg caggcagata aaacgcctta cgccgatgcc 661 gcttttatat ttaacgaaaa taaaaacgta aataagctgg gatttgtagc gcaagacgaa 721 gcgctgcgca 
caataatacc tacgccaact tttgtaatgg actctggcaa aaatatagat 781 attagcgcag gtataaacct gcagctacag ggggtggagc aagacgcagt tgcgccggca 841 ctggcgtggc tacaagcatt gggcctaaag caaaaccctg cgggcatgcc gtttgttgtg 901 tctgtttcgc gggcgagctt accgtcgcgc tcgccagtgg ggtcctatca attggtggta 961 tcgccaacgc aaattaccat ctttgcccgc gaaccggttg gtgcgtttta cggtatgcaa 1021 tcgttggcga gtgtaatgat agcgggcaga aatactttac ctgtgttaac cgttaacgat 1081 tcgcctcgtt acccttatcg cggtatgcac atagatgtag gtcgtaactt tcattccaaa 1141 caacaaatac tggatgtatt agatcaaatg gcggcgtaca agcttaacaa gctgcatttg 1201 catttgggtg aagatgaagg ctggcgcttg caaataccca gcttgccaga acttactgat 1261 gtgggcggta agcgctgtca cgatccacaa gaaaacacct gcttattaat gcagcttggg 1321 gcagacgtaa gcggcaaaag tgaacgcgat ggctattaca ctcggcaaga ttatatagag 1381 ctagtaaaag ctgcgaatgc gcgtcacata cagttaatcc cttcttttga tatgcccggt 1441 cattcgcgcg ctgtaataaa agctatggag gcgcgttacc gtaaattcat ggccgctggt 1501 aataaaaaag ccgctgaaca atatttactt tcagacccaa acgataaaac gcagtacaaa 1561 agtattcagt tttattccga taacacgatt aacgcgtgca tggaatctcc ttataaattt 1621 ttaggcaaag taatagacga agtaaaagcc atgcacagcg aagcgggcca gccgcttacg 1681 gtttaccata taggcgcaga tgaaaccgcc ggtgcttggg cgcaatcgcc aatatgccaa 1741 gcgttttttg ccaacaaccc ttacggtgta gaaaatgcca aacagctagg tgcttatttt 1801 atcgagcgcg tggccgcatt attagaaact aagggtatta aaaccgcagg ttggagcgat 1861 ggtttaagcc acactaaccc aaaaaatatg cccgccaagg tgcaatcgta tatttgggat 1921 gtattacctt gggggggcgt tgccgaagca aataagcaag ccaatcgagg gtgggatgta 1981 gtgctatctc acccagacgc gctgtatttt gacttcccat acgagccaga cccaaaagaa 2041 ggcggctatt attggggcag ccgccatata gatacccaca aagtatttaa ctatatgccc 2101 ggtaacttac cggctttggc agaggtatac ccaagcccta cccaaacagg gtttgaaata 2161 gcaggcacca ccccattaaa acaaggcgtg caatgggcgg gtatccaagg ccagctgtgg 2221 agcgaaacta tacgcagcga taacgctgtg gaatatatga tctttccgcg tttaattgcc 2281 ttggcagagc gcgcatggca cgcaccaagt tgggagccgc cctacaatta cgagggcgca 2341 acctataatg ctaatagcgg tttattttct gaaaataaaa aaagtgagcg cgataaagcg 2401 tggttaaaat tcgcaagcgt 
cattggctac aaagaattcg ttaagctaga tgccgccgac 2461 attcactacc gcataccaac ggtgggcgct attattcaag actccatgct acacgcaaat 2521 cttgcttacc cagggttagg tattgaatat aaagaagccg gtaaagattg gcagccttac 2581 aacaagccag tacaagtaaa aacgccggta ctggtgcgcg caaaagccgc aacgggggat 2641 agaaaggggc gtgcgttacc tgttgagtaa hexC polynucleotide sequence 1 atgttggaga ctaacaatca attgcttggt ccagttattg cggatattgc cggtcaaact SEQ ID NO: 8 61 ctttccgatg aagatatagc gctaataaag aacccgctaa ttggcgggtt aatactgttt 121 acccgtaact attcaacccc ttcacagctt gacgcgctag ttaagcaaat tcgcagtgta 181 cgggcagata taattcttgc tgttgaccac gagggtggca gggtgcagcg ctttcgggaa 241 ggctttaccc gcattccagc tatgcaagta tttgccagcg cttataaagc tcgtgccgag 301 ttaacccttg cgcttgcctg taataccggc tggttaatgg ctagtgaact tcgcgcttac 361 gacatagata ttagctttgc accagtattg gatgtggatg atagttttag cagcattatt 421 ggcgatagag ctttttcttc agaccccaaa gctgttactg cgctagcggg tgcatttata 481 gacggtatgc aacaagcagg tatggcttgt accggtaagc attttcctgg gcatggcagt 541 gtgcgtgccg atagccattt agagctgcca gtggattatc gctcgctcga agctatagag 601 cagctcgatt taatgccttt tgctaagttg caaagtaagc ttgatgctgt aatgcctgcc 661 catatattgt tcccagaggt tgacgatcag cccgttggct tttcttctgt ttggctgcaa 721 aaaatattgc gcgataaaat ggcctacgac ggtgtaattt ttagtgatga tttgacgatg 781 gaaggtgccg ccgtggcggg tagcttcggg gagcgagcca taaaagcaat gagcgctggc 841 tgcgacacat tattggtttg caacaatcgc gaggccaccc ttgaggttat tcagacattg 901 gcagataacg gcaactattc tacctctatt cgattgacca gaatgcgggg gaaagcaggg 961 gcgcaaccta tttatgattt acacaataat aaacgctggc aagaaaccaa agaagcatta 1021 ctagcacttg cttaa ChiA polypeptide sequence 1 atgtttaaaa aaactttagc cgttgcaggg ctagctttag cagcaaacaa tgcattcgca SEQ ID NO: 9 61 gcaaccaatt gcagtgacct caccgactgg aatagcagca cagcctatac cggtggcacc 121 tcggtaaaac acgccaacag taagtacacc gcccagtggt ggacacaggg tgcagacccg 181 acaagccatt caggccaatg gcaagagtgg aaatttatag atcagtgctc ttcatcgtct 241 agctcaagta gctctagcag cagttccagc tcgtccagca gtagttcaag ctctagcagc 301 tcatcttcaa gctcttccag tagcacctct tcaagttcat ccagctcatc 
cagttctggc 361 ggcagctgta cagacgcccc cgtctttgca gaaaacaccg catataacac cggcgatgtt 421 gtaaccaact tagaaaattt atacagctgt gttgtacccg gttggtgtaa attgggtggc 481 gcctatgagc caggtcaagg ctgggcgtgg gagcatgctt ggaaccacgt aggtacttgt 541 ggtacgtcat cctcttcatc tagctcgtct tccacctcct ctagcagctc aagctcgtct 601 agctcatcca gttcatcaag ctctagcagc tcgtcgtcat ccggcggtgt gggtggcgga 661 aaggtgcctg cacactcact tgtaggctac tggcacaatt ttgttaacgg cgcaggctgc 721 ccaatgcgct taagtgaaat gtcggataag tgggacgtaa ttgacattgc ctttgccgat 781 aacgacccag caagcaatgg taccgtacac tttaatttgt tccccggtac aggcaactgc 841 ccagcaatga atgcagaaca attcaaagcc gatatgcgtg cgctacaggc acaaggtaaa 901 gtatttgtgt tatcgcttgg tggcgcagaa ggcaccataa ccttaaacac cgatgccgac 961 gaagttaatt ttgttaacag cttaactaac ttaattaacg agtggggatt cgatggtgta 1021 gacatagatt tagaaagcgg ctcgcaactt ttgcacggct cgcaaattca agcgcgcctc 1081 attacgtcgc tgcgcaccat tgatgccaat gtaggcggta tggtgttaac catggcacca 1141 gagcatcctt atgtacaagg tggctacatt gcttactcag gaatttgggg tgcgtatttg 1201 ccaattattg atgcgctgcg cgatcagttg gatctactgc atgtgcagct gtataacaat 1261 ggcggcatcc tatcgcctta taacccgcaa acgttccctg caggctcagt agatatgatg 1321 gttgcctctg cacgtatgct tatagaaggc tttaatacgg gcgatggcgg ttacttccaa 1381 ggtttgcgac cagatcaggt atcactaggc ttaccttctg gcccaagctc tgctggctct 1441 ggcttggcaa ctaaccaagc aatcatggac gcattggatt gtattacccg aggaacacat 1501 tgcggcacta tcgacgccgg cggcatatac ccgtcattta acggtgtaat gacgtggtcg 1561 ataaactggg atgcccacga tggctatatt ttctctaacc ctattggcga taaggttcac 1621 agcttaccgt aa ChiB polypeptide sequence 1 mnltkfavaa lsvavlsacg ggagnspspg agsntntesa ssssssssss stsstsssss SEQ ID NO: 10 61 sssgsaevnv didvdidven gsssssssgs sssstgggdi tiideiesst ssstssssss 121 gatsssstss ssssssssss sgatgsssss sgagstssss ssssssssss ssssssssss 181 ssstgggnag vdaelgysig dvyapsfdyt avggerktdn yrvigyymps ldgsfppsai 241 geqqaqmlth inyafigins qlecdfidve kadaetqiia elqalknwna dlkilfsvgg 301 waesndaaet vsryrdafap anrehfvssc vafmqqhgfd gididweypr aedvdnfiag 361 laamrnqlda rgngelvtia 
gaggafflsr yysklaaive qldfinlmty dlngpwngvt 421 ktnfhahlyg nnqeprfyna lreadlgltw eeiverfpsp feltvdaaik qhlmmdipre 481 kivmgvpfyg raffntgssn tglyqtfntp ngdpyvgdas llvgceacea rgepriaffn 541 diqqliegny gytrhfddqt kapwlyhaen nifvtyddaq slvyktdyik qqglggamfw 601 hlgqddsqft llatlhteln ganagslqgg nsetdnttde tegnnednte qnpeentdte 661 etetetetet etetetetet sveqptapti awmntsytgs svtvtitwnm ywgtngnqwq 721 lwldgeqvys anlttngqna qtdskiitit gagahsvevk lcnqqdinvs casdsetitl 781 qggsdgatss sssstsssss ssssstggst sstsssssst ssssssssss ssstsggget 841 dlsgvvygey nntykqtsdk iivtyfvewg iygrdyhvnn ipasnlthvl fgfiamcgdn 901 phasggaqaa iasecadkqd fevtlvdrfa nlektypgdt wyddttgqdy ngnfgqlrkl 961 kaqhphlkil psiggwtmst pfyemaknea nravfvesav nfikkydffd gvdidweypv 1021 yggtapelst aadrdaytal mrdlraalde laeetgreye itsavgaape kiaavdyasa 1081 ttymdyiflm sydymgawan ttghhtplyn nneeregfnt hasvqnllta gvpssklvvg 1141 gafygrgwvg tqntnaaksd lfplygqasg aakgtweagv qdyrdlydny igtngtging 1201 fsahydeiae aaylwnsstg efisydsprs iaakadyvkq ynlagmltwe idgdngqlln 1261 ainesfgnek q ChiC polypeptide sequence 1 mnpiakltla tgamlsahva yaydcdglat wnassayags tvvqhsnvay kanwwtqnqn SEQ ID NO: 11 61 pashsgpwqe wtnlgncdgd gggntnqaps anangpyaaq lgaaiafssa gssdsdgnia 121 synwtfgdgn ssnqaspsht ygsqgtyavt ltvtdnegas ssattsasvt qggdpgdcqa 181 pqysagtqya agdivanggn lyqcniagwc sssaawayap gtgahwqdaw sltsecddng 241 ntnqaptana ngpysgsagi sisfssngsa dsdgtiasys wnfgdgasss qanpshsymn 301 egtyqvsltv tdddgasata fttanvtgng enqepvasis apssasegas vnfssagsnd 361 pdgsivsysw nfgdgtssqq anpshtyssa gsysvsltvv dnegannvan hsitisgdtg 421 ggthgdkiig yfaewgvygr nyhvknihts gsadklthiv yafgnvqnge ckigdsyaay 481 dkaysaadsv dgvadtwddg vlrgnfgqlr rlkamhpqik ivwsfggwtw sggfgeaaan 541 adhfanscyd lvfdarwadv fdgididwey pndcglscdn sgydgyrvlm qalrnrfgnk 601 lvtaaigage skqnaadygg aaqyldfyml mtydffgafn pqgptaphsp lynypgmpie 661 gfssdhgiqv lkskgvpaek illgigfygr gwtnvtqdap ggsangaapg tyekgiedyk 721 vlkntcpatg tiagtayakc gsnwwgydtp atidskmdya kqqglggaff welsgdttdg 781 
eliraidngl kn CbpA polypeptide sequence 1 mqpikstkrn limfakkity stialaiagl sgnalshglm vdppsrnalc gmiekpdqat SEQ ID NO: 12 61 spacqqafqn dfnggyqfms vlthdigrqg gtsnnvcgfd setwnggatp wdaaidwptt 121 qissgpleid wniswgphwd dteefvyyit kpdfvyqvgv plswsdfeat pfcqldysda 181 npnanpgvst tksanlfhtq cnvparsgrh viygewgrny ftyerfhgcm dvffggsnpp 241 psnqaptana qsvnvssgss vsitlsgsdv dgvissyaia aapsngslsg sgaqrlytpn 301 gnfsgsdsfq ftvtdddgat snaatvsinv ssqpepepep epepepepgt gascehvvn 361 awdsgfqgai ritntsdqni ngwnvswsyn ngttisqlwn anfsgsnpys asnlgwnati 421 qpgqtvefgf tgngsvpaap avtgavcn CdxA polypeptide sequence 1 mknkhclaal alaisthaya apgtpniawl pathesgeai nvhwdmwwge ngtewqltdn SEQ ID NO: 13 61 gdlrcsgslt angqnqqsae caanyssgsh alqvslcnts gcsesnvvti nvnqgassnv 121 ppqvsisapa sagegdsitl satasdsdgt itsvtflvdg iaiatdttsp ystnwiakag 181 thsltaqald nqnatgddsv sisvtsapnq lpsvslvapn anlmagsets feinasdagg 241 sissvelyln gnllgtdtsa pynvswtaea gdhsiyavas ddrggvsqsd tvfltvaedt 301 naapsvslst vptdamegda ltleaaasds dgsvaqvdfy lnnqllgsat sapyslqwta 361 trgshtlrat avdnqgktas aistfsvaad tsashedcrp dglyatpevq spyctvydiq 421 grelmgsatr rvigyftswr tggngpayla hqipwdkith inyafahvdg nnhvsigant 481 ptnaatgmew pdvagaemdp sfsykghfnl lnkykkqyph vktlisiggw aetggyfdsn 541 gdrvnsggfy tmttnadgsv ntaginffad swefirtys fdgadidyey ptsmndagnp 601 sdfaianarr kglnasynvl mktlrqkldi ageqdgkhym ltiaspssgy llrgmeafea 661 tqyldyvnim sydlhgawnq fvgpnaalfd ngqdaeliqw nayggqykni gylntdwayh 721 yfrgampagr inigvpyytr gwqgvtggtn glwgqaslpn qsecpvgtgg satskcgnga 781 vgidnlwhdk dengnemgag snpmwhaknl ennilgdylt aygldpinnp dhqlsgnyqr 841 yyddvlvapw lwnaakqvfi stedeqsinr kadyvvengi ggimfwelag dyqfnaakgq 901 yemghtltta iadkfanapa ygnqraeidm pqqtldigik ltnfalgdnn fpitpdliit 961 nntgqnlpgg tefyfdiats tpdnmgdqsa asltivsngs naagnnvggl ennfhrvkis 1021 tpsyltladg eewkvvlkyy lpvsmpsnwv vnvageefal sseypnlpmg sissgggnng 1081 ggntggdcsn asdypaypnf pqkdwagnps hanagdrmth nnalyeakww tsatpgtsdw 1141 dlvctf HexA polypeptide sequence 1 mklrllphsi 
slasllllsa cqqehatstn aqlspiappa isivpapvsa eiktgqfvfg SEQ ID NO: 14 61 nstqltvnse klrdvaqlwa dffnvasgin lqvqsatgns deansvslel vpasefsssn 121 aeayeltvtd naitvrastr agifygltsl rqllppqies pspinsvnwv vpavaivdep 181 lypyrgmhld vsrhffdvnf ikryidilaf hkmnrfhwhl tddqgwripi daypllteks 241 awrdktvigh tydrdvaynt nriggfyske qirdivayaa erqimvipei dvpghaaail 301 haypefgcie qvsqvqsnfg ifeqvlcpte pffeflravf tevaelfpge ylhvggdevk 361 kvqwqqspfv telmqreglk dyhevqsyfi crvgeivssl dkkmlgwnei ldggiapnat 421 imswqgvegg iaaaelghda imspgnyvyf dhfqsrsvde plaihgitpl setysynpmp 481 eqfagtekak hilgaqgqlw teyvpttaka eymilprlsa vaeitwtpvn kqswqsfser 541 lpslfarfde mglnaarsvy aitatakteg sgedakyrvn lasdtahvii ryttdgtlpn 601 aqspiysepf lvegdtfvra rsqdkisgnf ylesqlrtvk hkavgakltl lseantewnk 661 dpvktlsdgi tsidqifqld dwatffgdev vahitfakaq tvsevsigfn pgkhrqmypp 721 trlhilsssd getwqslgea dpqhlatakn rvsytfaptt trhlrieaen ktrvlstesg 781 klksvplyld eiivk HexB polypeptide sequence 1 malfskyvwq vavagalgtv sllgsrlyaq tadtqqwidg iasnmqvhyq vllnkgdgec SEQ ID NO: 15 61 slpslppspk spcsivelsl sspdklaand ldgnwsiyfs qtdpiyahpa geftidhing 121 dlhrirpsas yqgfnvgevk kvqfivaglt lteakimpny yvvaegqdnk qalysearvi 181 estririhpe tgleerpfag eisrqnfkls qadktpyada afifnenknv nklgfvaqde 241 alrtiiptpt fvmdsgknid isaginlqlq gveqdavapa lawiqalglk qnpagmpfvv 301 svsraslpsr spvgsyqlvv sptqitifar epvgafygmq slasvmiagr ntlpvltvnd 361 sprypyrgmh idvgrnfhsk qqildvldqm aayklnklhl hlgedegwrl qipslpeltd 421 vggkrchdpq entcllmqlg advsgkserd gyytrqdyie lvkaanarhi qlipsfdmpg 481 hsravikame aryrkfmaag nkkaaeqyll sdpndktqyk siqfysdnti nacmespykf 541 lgkvidevka mhseagqplt vyhigadeta gawaqspicq affannpygv enakqlgayf 601 iervaallet kgiktagwsd glshtnpknm pakvqsyiwd vlpwggvaea nkqanrgwdv 661 vlshpdalyf dfpyepdpke ggyywgsrhi dthkvfnymp gnlpalaevy psptqtgfei 721 agttplkqgv qwagiqgqlw setirsdnav eymifprlia laerawhaps weppynyega 781 tynansglfs enkkserdka wlkfasvigy kefvkldaad ihyriptvga iiqdsmlhan 841 laypglgiey keagkdwqpy nkpvqvktpv lvrakaatgd 
rkgralpve HexC polypeptide sequence 1 mletnnqllg pviadiagqt lsdedialik npligglilf trnystpsql dalvkqirsv SEQ ID NO: 16 61 radiilavdh eggrvqrfre gftripamqv fasaykarae ltlalacntg wlmaselray 121 didisfapvl dvddsfssii gdrafssdpk avtalagafi dgmqqagmac tgkhfpghgs 181 vradshlelp vdyrsleaie qldlmpfakl qskldavmpa hilfpevddq pvgfssvwlq 241 kilrdkmayd gvifsddltm egaavagsfg eraikamsag cdtllvcnnr eatleviqtl 301 adngnystsi rltrmrgkag aqpiydlhnn krwqetkeal lala GH18_(N) polypeptide sequence 1 nagvdaelgy sigdvyapsf dytavggerk tdnyrvigyy mpsldgsfpp saigeqqaqm SEQ ID NO: 17 61 lthinyafig insqlecdfi dvekadaetq iiaelqalkn wnadlkilfs vggwaesnda 121 aetvsryrda fapanrehfv sscvafmqqh gfdgididwe ypraedvdnf iaglaamrnq 181 ldargngelv tiagaggaff lsryysklaa iveqldfinl mtydlngpwn gvtktnfhah 241 lygnnqeprf ynalreadlg ltweeiverf pspfeltvda aikqhlmmdi prekivmgvp 301 fygraffntg ssntglyqff ntpngdpyvg dasllvgcea ceargepria tfndiqqlie 361 gnywytrhfd dqtkapwlyh aennifvtyd daqslvyktd yikqqglgga mfhlgqddsq 421 ftllatlhte lnganagslq ggnsetdntt detegnnedn te GH18_(N) polynucleotide sequence 1 tagttcatca tcaagctcta gtagctcaag ttcgtcttct agttcatcgt caagttcttc SEQ ID NO: 18 61 aagctcttct agttcatcaa gcacgggcgg tggcaatgcg ggtgtagatg ccgaattggg 121 ttacagcatt ggcgacgtct atgcgccaag ctttgattac accgcagtag gcggcgagcg 181 caaaacagat aactaccgcg ttattggcta ttacatgcca agtttagatg gttcgtttcc 241 gcctagcgca attggtgagc aacaagcgca aatgcttacc catattaact atgcatttat 301 tggtattaac agccagctag agtgcgattt tatagatgta gaaaaagccg acgcagaaa 361 ctcaaattatt gctgagttac aagcactaaa aaattggaat gccgatttaa aaatcctttt 421 ttctgtaggg ggttgggcag aatctaacga cgcagccgaa accgttagcc gctaccgcga 481 tgcgtttgca ccggcaaacc gcgagcattt tgttagctcg tgtgtagcct ttatgcaaca 541 acacggcttt gatggcatag atatagattg ggaataccct cgcgccgaag atgtagataa 601 ctttattgcc ggcctagcag caatgcgcaa ccaattggat gcacgcggca acggcgagct 661 agttaccatt gctggcgcag gcggtgcgtt ctttttaagc cgttattaca gcaagctagc 721 tgccatagta gaacagttag actttataaa tttaatgacc tacgacctaa acggaccgtg 781 
gaacggcgta acaaaaacta actttcacgc acacctgtac ggcaacaacc aagagccgcg 841 cttttacaac gcgctgcgcg aagcagacct tggtttaacg tgggaagaaa tagtagagcg 901 ttttcctagc ccgttcgagc tcaccgtaga tgccgccatt aaacaacatt taatgatgga 961 tattccgcgc gaaaaaattg taatgggcgt acctttttac ggtcgtgcat tttttaacac 1021 aggttcatca aacaccggtt tataccaaac ctttaacacc ccaaatggtg acccctatgt 1081 aggtgacgct agcttattgg ttggttgtga agcctgcgaa gcgcgcggcg agccacgcat 1141 tgctaccttt aacgatattc aacaacttat agaaggtaac tacggctata cccgtcactt 1201 tgatgatcaa accaaagcgc cttggttgta tcacgcagaa aataatatat ttgtaaccta 1261 cgacgatgct caatcgttgg tgtataaaac cgattatatt aaacaacaag gtttaggcgg 1321 tgcgatgttt tggcacctag gccaagatga ttcgcaattt actttattgg ctactttaca 1381 caccgagcta aacggcgcaa acgctggtag cctgcaaggt ggcaatagcg aaaccgacaa 1441 cacaacggac gaaacag GH18_(c) polypeptide sequence 1 dlsgvvygey nntykqtsdk iivtyfvewg iygrdyhvnn ipasnlthvl fgfiamcgdn SEQ ID NO: 19 61 phasggaqaa iasecadkqd fevtlvdrfa nlektypgdt wyddttgqdy ngnfgqlrkl 121 kaqhphlkil psiggwtmst pfyemaknea nravtvesav nfikkydffd gvdidweypv 181 yggtapelst aadrdaytal mrdlraalde laeetgreye itsavgaape kiaavdyasa 241 ttymdyiflm sydymgawan ttghhtplyn nneeregfnt hasvqnllta gvpssklvvg 301 gafygrgwvg tqntnaaksd lfplygqasg aakgtweagv qdyrdlydny igtngtging 361 fsahydeiae aaylwnsstg efisydsprs iaakadyvkq ynlagmltwe idgdngqlln 401 ainesfgnek q GH18_(c) polynucleotide sequence 1 ggcgaaacag atttatctgg cgtggtttac ggcgagtaca acaacactta caaacaga SEQ ID NO: 20 61 cgagcgataa aataattgtt acttactttg tagagtgggg catttatggc cgcgactatc 121 acgtaaataa tattccggcg tctaacctta cgcacgtact gtttggcttt attgcaatgt 181 gtggcgataa cccacacgcc tcaggcggcg cgcaagcggc tattgctagc gagtgtgcag 241 ataagcaaga ttttgaagtt accttggtag atcgtttcgc caacctagaa aaaacttacc 301 caggcgatac gtggtacgac gatacaaccg gtcaagatta caatggtaac tttgggcaac 361 tacgcaaact aaaagcacag cacccgcatt taaaaatatt gccatctatt ggcggctgga 421 caatgtctac cccattttat gaaatggcaa aaaatgaagc taaccgcgca gtgtttgttg 481 aatctgccgt taactttatt aaaaaatatg acttcttcga 
cggagtagat atagattggg 541 aataccctgt atacggcggt acagccccag aattatctac cgctgccgac cgcgatgcct 601 ataccgcctt aatgcgtgac ctacgcgcag cattagacga gctggcagaa gaaacgggtc 661 gcgaatacga aattacttcg gccgtaggtg cagcaccaga aaaaattgca gcagtagatt 721 acgccagtgc cacaacgtat atggattaca tattcctaat gagctacgac tacatgggcg 781 catgggcgaa cacaacgggt caccacaccc cgctgtacaa caacaacgaa gagcgagaag 841 gttttaacac acatgcgtct gtgcaaaacc tattaaccgc aggtgtgcct tcatccaaat 901 tagtcgtggg tggtgcattc tacggccgcg gctgggtagg cacccaaaat accaacgctg 961 ccaaaagcga tttattcccg ctatatggcc aagcttctgg cgcggcaaaa ggcacctggg 1021 aagcaggggt acaagactac cgcgacctgt acgacaacta tattggcacc aatggcacag 1081 gcattaatgg ctttagcgca cactacgacg aaatagccga agccgcctac ctttggaaca 1141 gcagcaccgg cgaatttata agctacgatt cgccgcgctc tattgcagca aaagccgatt 1201 acgtaaaaca atacaatcta gctggcatgc taacctggga aatagacggc gataacggcc 1261 aactactcaa cgccattaac gaaagtttcg gcaacgaaaa gcagtag 
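The listings above are printed in a flattened layout in which residues appear in blocks of ten and a running position counter is interleaved with the sequence. The following is a minimal sketch, not part of the disclosure, of how such a fragment might be reduced to a contiguous string and, for the polynucleotides, reverse-complemented. The function names clean_sequence and reverse_complement are hypothetical, and the sketch assumes the input contains only residues and counters (headings such as "SEQ ID NO: 20" must be removed beforehand, since their letters would otherwise be kept).

```python
import re

def clean_sequence(raw: str) -> str:
    """Return the residues of a listing fragment as one lowercase string.

    Assumption: the fragment holds only residue letters, whitespace, and
    numeric position counters, so everything that is not a letter is dropped.
    """
    return re.sub(r"[^a-z]", "", raw.lower())

# Translation table for the four unambiguous DNA bases only.
_COMPLEMENT = str.maketrans("acgt", "tgca")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a lowercase DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]

# Closing fragment of SEQ ID NO: 20 as printed above.
fragment = "1261 aactactcaa cgccattaac gaaagtttcg gcaacgaaaa gcagtag"
seq = clean_sequence(fragment)
print(len(seq), seq[-3:])  # length of the fragment and its final codon
```

Dropping every non-letter character is a deliberately simple choice: it tolerates irregular spacing and misplaced counters in the printed text, but it cannot detect residues that were lost or duplicated during printing.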

1. An isolated polynucleotide comprising the nucleotide sequence of at least one of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:18, or SEQ ID NO:20.
 2. An isolated polynucleotide encoding a polypeptide comprising the amino acid sequence of at least one of SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19.
 3. An isolated polypeptide comprising the amino acid sequence of at least one of SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19.
 4. An isolated polynucleotide encoding a polypeptide having chitin depolymerase activity.
 5. An isolated polynucleotide encoding a polypeptide having chitodextrinase activity.
 6. An isolated polynucleotide encoding a polypeptide having N-acetyl-D-glucosaminidase activity.
 7. An isolated polynucleotide encoding a polypeptide having chitin binding activity.
 8. An isolated polynucleotide complementary to at least one of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:18, or SEQ ID NO:20 under a stringency condition of from 1×SSC to 10×SSC.
 9. A chimeric gene comprising at least one polynucleotide encoding a polypeptide comprising the amino acid sequence of at least one of SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19.
 10. The chimeric gene of claim 9, wherein the at least one polynucleotide is selected from SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:18, or SEQ ID NO:20, and wherein the gene is operably linked to regulatory sequences that allow expression of the amino acid sequence in a host cell.
 11. The chimeric gene of claim 9 contained in a host cell.
 12. The chimeric gene of claim 11, wherein the host cell is an Escherichia coli cell.
 13. A vector comprising the chimeric gene of claim 9.
 14. A vector comprising at least one polynucleotide encoding a polypeptide comprising the amino acid sequence of at least one of SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19.
 15. The vector of claim 14, wherein the at least one polynucleotide is selected from SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:18, or SEQ ID NO:20.
 16. A prokaryote comprising at least one polynucleotide encoding a polypeptide having chitin depolymerase activity, chitodextrinase activity, N-acetyl-D-glucosaminidase activity, or chitin binding activity.
 17. The prokaryote of claim 16, wherein the prokaryote is Escherichia coli.
 18. An isolated polypeptide comprising at least two domains, wherein the domains are separated by a poly-amino acid linker.
 19. The isolated polypeptide of claim 18, wherein at least 90% of the amino acids in the poly-amino acid linker are serines.
 20. The isolated polypeptide of claim 19, wherein at least 80% of the amino acids in the poly-amino acid linker are serines.
 21. The isolated polypeptide of claim 20, wherein at least 70% of the amino acids in the poly-amino acid linker are serines.
 22. A method for breaking at least one bond between glucosamine units in a chitooligosaccharide comprising applying to the chitooligosaccharide a composition comprising at least one polypeptide that binds to the chitooligosaccharide.
 23. The method of claim 22, wherein the chitooligosaccharide is a component of an insoluble complex polysaccharide and the method comprises breaking more than one bond.
 24. The method of claim 22, wherein the polypeptide that binds to the chitooligosaccharide is selected from SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19.
 25. A method for identifying at least one nucleotide sequence encoding a polypeptide comprising at least one of chitin depolymerase activity, chitodextrinase activity, N-acetyl-D-glucosaminidase activity, or chitin binding activity from M. degradans, the method comprising constructing an M. degradans genomic library in E. coli and screening the library for at least one of chitin depolymerase activity, chitodextrinase activity, N-acetyl-D-glucosaminidase activity, or chitin binding activity.
 26. A method of treating asthma comprising administering a composition comprising at least one of SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, or SEQ ID NO:19. 
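
Claims 19 to 21 recite minimum serine percentages (90%, 80%, and 70%) for the poly-amino acid linker. The following is a minimal sketch, not part of the claims, of how a candidate linker string might be checked against such a threshold. The function names are hypothetical, and the example window (residues 41 to 60 of SEQ ID NO: 10 as printed) is chosen here only for illustration; the claims do not define where a linker begins or ends.

```python
def serine_count(linker: str) -> tuple[int, int]:
    """Return (number of serines, total residues) for a one-letter-code string.

    Spaces are ignored so that text copied from the printed listing,
    which is grouped in blocks of ten, can be passed in directly.
    """
    residues = linker.replace(" ", "").lower()
    if not residues:
        raise ValueError("linker must contain at least one residue")
    return residues.count("s"), len(residues)

def meets_serine_threshold(linker: str, percent: int) -> bool:
    """True if at least `percent` percent of the residues are serine.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    serines, total = serine_count(linker)
    return 100 * serines >= percent * total

# Illustrative window only: residues 41-60 of SEQ ID NO: 10 as printed.
window = "ssssssssss stsstsssss"
print(serine_count(window))
for pct in (90, 80, 70):
    print(pct, meets_serine_threshold(window, pct))
```

Because the result depends entirely on which residues are treated as the linker, any real assessment would first need the domain boundaries, which this sketch does not attempt to determine.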